(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 




(43) International Publication Date: 24 October 2002 (24.10.2002)

(10) International Publication Number: WO 02/083855 A2



(51) International Patent Classification 7 : C12N 

(21) International Application Number: PCT/US02/11524

(22) International Filing Date: 12 April 2002 (12.04.2002) 



[US/US]; 9 Glenley Terrace, Brighton, MA 



(25) Filing Language: 

(26) Publication Language: 

(30) Priority Data:

60/283,948    16 April 2001 (16.04.2001)    US
60/284,443    18 April 2001 (18.04.2001)    US



(71) Applicant (for all designated States except US): AMERICAN
CYANAMID COMPANY [US/US]; Five Giralda
Farms, Madison, NJ 07940 (US).

(72) Inventors; and

(75) Inventors/Applicants (for US only): ZAGURSKY,
Robert, John [US/US]; 569 Fox Hunt Drive, Victor, NY
14564 (US). MASI, Amy, Wadhams [US/US]; 326 Grand
Circle, Caledonia, NY 14423 (US). GREEN, Bruce,
Arthur [US/US]; 49 Northfield Gate, Pittsford, NY
14534 (US). CHAKRAVARTI, Deb, Narayan [IN/US]; 2
Fairway Crossing, Pittsford, NY 14534 (US). RUSSELL,
David, Parrish [US/US]; 240 North Pleasant Street,
Canandaigua, NY 14424 (US). WOOTERS, Joseph,



(74) Agents: BRAZIL, Bill, T; Wyeth, Patent Law Depart- 
ment, Five Giralda Farms, Madison, NJ 07940 et al. (US). 

(81) Designated States (national): AE, AG, AL, AM, AT, AU, 

AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU, 
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH, 
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, 
MX, MZ, NO, NZ, OM, PH, PL, PT, RO, RU, SD, SE, SG, 
SI, SK, SL, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, 
VN, YU, ZA, ZM, ZW. 



(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZM, ZW),
Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM),
European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR,
GB, GR, IE, IT, LU, MC, NL, PT, SE, TR), OAPI patent
(BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR,
NE, SN, TD, TG). 

Published: 

— without international search report, and to be republished 
upon receipt of that report 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.



(54) Title: NOVEL STREPTOCOCCUS PNEUMONIAE OPEN READING FRAMES ENCODING POLYPEPTIDE ANTIGENS 
AND USES THEREOF 

(57) Abstract: The present invention relates to newly identified open reading frames comprised within the genomic nucleotide
sequence of Streptococcus pneumoniae, wherein the open reading frames encode polypeptides that are surface localized on
Streptococcus pneumoniae. Thus, the invention relates to Streptococcus pneumoniae open reading frames that encode polypeptide antigens,
polypeptides, preferably antigenic polypeptides, encoded by the Streptococcus pneumoniae open reading frames, vectors comprising
open reading frame sequences and cells or animals transformed with these vectors. The invention relates also to methods of
detecting these nucleic acids or polypeptides and kits for diagnosing Streptococcus pneumoniae infection. The invention finally relates to
pharmaceutical compositions, in particular immunogenic compositions, for the prevention and/or treatment of bacterial infection, in
particular infections with Streptococcus pneumoniae. In particular embodiments, the immunogenic compositions are used for the
treatment or prevention of systemic diseases which are induced or exacerbated by Streptococcus pneumoniae. In other embodiments,
the immunogenic compositions are used for the treatment or prevention of non-systemic diseases, particularly otitis media,
which are induced or exacerbated by Streptococcus pneumoniae.






NOVEL STREPTOCOCCUS PNEUMONIAE OPEN READING FRAMES
ENCODING POLYPEPTIDE ANTIGENS
AND USES THEREOF



This application claims priority from copending provisional application serial
number 60/283,948, filed on April 16, 2001, the entire disclosure of which is hereby
incorporated by reference, and provisional application serial number 60/284,443, filed
on April 18, 2001, the entire disclosure of which is hereby incorporated by reference.



Field of the Invention 

The invention relates to Streptococcus pneumoniae genomic sequence and
polynucleotide sequences encoding polypeptides of Streptococcus pneumoniae.
More particularly, the invention relates to newly identified polynucleotide open
reading frames comprised within the genomic nucleotide sequence of Streptococcus
pneumoniae, wherein the open reading frames encode Streptococcus pneumoniae
polypeptides, preferably polypeptides that are surface localized, secreted, membrane
associated or exposed on Streptococcus pneumoniae.



Background of the Invention 

Streptococcus pneumoniae infections are a major cause of human diseases
such as otitis media, bacteremia, meningitis, septic arthritis and fatal pneumonia
worldwide (Butler et al., 1999; James and Thomas, 2000). Over the past 10-20
years, Streptococcus pneumoniae has developed resistance to most antibiotics used
for its treatment. In fact, it is common for Streptococcus pneumoniae to become
resistant to more than one class of antibiotic, e.g., β-lactams, macrolides,
lincosamides, trimethoprim-sulfamethoxazole, tetracyclines (Tauber, 2000), meaning
Streptococcus pneumoniae treatment is becoming more difficult.

Thus, the rapid emergence of multi-drug resistant pneumococcal strains
throughout the world has led to increased emphasis on prevention of pneumococcal
infections by immunization (Goldstein and Garau, 1997). The currently available
23-valent pneumococcal capsular polysaccharide vaccine is not effective in children of
less than 2 years of age or in immunocompromised patients, two of the major
populations at risk from pneumococcal infection (Douglas et al., 1983). A 7-valent
pneumococcal polysaccharide-protein conjugate vaccine, recently licensed in the
United States, was shown to be highly effective in infants and children against
systemic pneumococcal disease caused by the vaccine serotypes and against
cross-reactive capsular serotypes (Shinefield and Black, 2000). The seven capsular types
cover greater than 80% of the invasive disease isolates in children in the United
States, but only 57-60% of disease isolates in other areas of the world (Hausdorff et
al., 2000). There is therefore an immediate need for a cost-effective vaccine to cover
most or all of the disease causing serotypes of pneumococci. While this can be
achieved by adding conjugates covering additional serotypes, efforts continue to find
non-capsular vaccine antigens that are conserved among all pneumococcal
serotypes and effective against pneumococcal disease.

Protein antigens of Streptococcus pneumoniae have been evaluated for
protective efficacy in animal models of pneumococcal infection. Some of the most
commonly studied candidate antigens include the PspA proteins, PsaA lipoprotein,
and the CbpA protein. Numerous studies have shown that PspA protein is a
virulence factor (Crain et al., 1990; McDaniel et al., 1984) but it is antigenically
variable among pneumococcal strains. A recent study has indicated that some
antigenically conserved regions of a recombinant PspA variant may elicit
cross-reactive antibodies in human adults (Nabors et al., 2000). PsaA, a 37 kD lipoprotein
with similarity to other gram-positive adhesins, is involved in Mn2+ transport in
pneumococci (Sampson et al., 1994; Dintilhac et al., 1997) and has also been shown
to be protective in mouse models of systemic disease (Talkington et al., 1996). The
surface exposed choline binding protein CbpA is antigenically conserved and
protective in mouse models of pneumococcal disease (Rosenow et al., 1997). Since
nasopharyngeal colonization is a prerequisite for otic disease, intranasal
immunization of mice with pneumococcal proteins and appropriate mucosal
adjuvants has been used to enhance the mucosal antibody response and thus, the
effectiveness of candidate antigens (Yamamoto et al., 1998; Briles et al., 2000).

While the PspA protein, PsaA lipoprotein and the CbpA protein antigens
appear promising, it is possible that no one protein antigen will be effective against all
Streptococcus pneumoniae serotypes. Laboratories therefore continue to search for
additional candidates that are antigenically conserved and elicit antibodies that
reduce colonization (important for otitis media), are protective against systemic
disease, or both. Thus, there is an immediate need for a cost-effective vaccine to
cover most or all of the disease causing serotypes of Streptococcus pneumoniae and
methods of diagnosing Streptococcus pneumoniae infection. A better understanding
of the genetic and molecular levels of Streptococcus pneumoniae infection will
provide the basis for further development of preventative treatments, therapeutic
treatments, new diagnostics and vaccine strategies which are specific for
Streptococcus pneumoniae.

Summary of the Invention 

The present invention broadly relates to Streptococcus pneumoniae genomic
sequence. More particularly, the invention relates to newly identified polynucleotide
open reading frames comprised within the genomic nucleotide sequence of
Streptococcus pneumoniae, wherein the open reading frames encode polypeptides
that are surface localized, membrane associated, secreted, or exposed on
Streptococcus pneumoniae.

Thus, in certain aspects, the invention relates to Streptococcus pneumoniae
open reading frames that encode Streptococcus pneumoniae polypeptides. In
preferred embodiments, these Streptococcus pneumoniae polypeptides are antigenic
polypeptides. As defined hereinafter, a Streptococcus pneumoniae antigenic
polypeptide, antigen or immunogen, is a Streptococcus pneumoniae polypeptide that
is immunoreactive with an antibody or is a Streptococcus pneumoniae polypeptide
that elicits an immune response. In other embodiments, the invention relates to the
polynucleotides encoding these antigenic polypeptides. In other aspects, the
invention relates to vectors comprising open reading frame sequences and cells or
animals transformed, transfected or infected with these vectors. The invention
relates also to methods of detecting these nucleic acids or polypeptides and kits for
diagnosing Streptococcus pneumoniae infection. The invention further relates to
pharmaceutical compositions, in particular immunogenic compositions, for the
prevention and/or treatment of bacterial infection, in particular infections with
Streptococcus pneumoniae. In a preferred embodiment, the immunogenic
compositions are used for the treatment or prevention of systemic diseases that are
induced or worsened by Streptococcus pneumoniae. In another preferred
embodiment, the immunogenic compositions are used for the treatment or prevention
of non-systemic diseases, particularly otitis media, which are induced or
worsened by Streptococcus pneumoniae.

In particular embodiments, an isolated polynucleotide of the present invention
is a polynucleotide comprising a nucleotide sequence having at least about 95%
identity to a nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID
NO: 215 or SEQ ID NO:431 through SEQ ID NO:591, a degenerate variant thereof,
or a fragment thereof. As defined hereinafter, a "degenerate variant" is a
polynucleotide that differs from the nucleotide sequence shown in SEQ ID NO:1
through SEQ ID NO:215 and SEQ ID NO:431 through SEQ ID NO:591 (and
fragments thereof) due to degeneracy of the genetic code, but still encodes the same
Streptococcus pneumoniae polypeptide (i.e., SEQ ID NO:216 through SEQ ID
NO:430 and SEQ ID NO:592 through SEQ ID NO:752) as that encoded by the
nucleotide sequence shown in SEQ ID NO:1 through SEQ ID NO:215 and SEQ ID
NO:431 through SEQ ID NO:591.
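
By way of illustration only, the notion of a degenerate variant defined above may be sketched as follows. The sketch assumes the standard genetic code, and the two five-codon sequences are hypothetical examples constructed for this illustration; they are not taken from any SEQ ID NO of the application. The function names are likewise assumptions of the sketch.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
# Assumption: the standard codon assignments are adequate for this illustration.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA open reading frame up to the first stop codon."""
    orf = orf.upper()
    protein = []
    for pos in range(0, len(orf) - len(orf) % 3, 3):
        residue = CODON_TABLE[orf[pos:pos + 3]]
        if residue == "*":  # stop codon ends the polypeptide
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True when the nucleotide sequences differ but encode the same polypeptide."""
    differs = reference.upper() != candidate.upper()
    return differs and translate(reference) == translate(candidate)


# Hypothetical sequences for illustration only.
reference = "ATGAAATTACCTTAA"  # Met-Lys-Leu-Pro-stop
candidate = "ATGAAGCTGCCATAA"  # same residues, synonymous codons
print(translate(reference), is_degenerate_variant(reference, candidate))
```

In this sketch a sequence is not treated as a degenerate variant of itself, since the definition above requires that the nucleotide sequences differ.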

In other embodiments, the polynucleotide is a complement to a nucleotide
sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID
NO:431 through SEQ ID NO:591, a degenerate variant thereof, or a fragment
thereof. In yet other embodiments, the polynucleotide is selected from the group
consisting of DNA, chromosomal DNA, cDNA and RNA and may further comprise
heterologous nucleotides.
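
As a minimal sketch, assuming that "complement" is read in the usual sense of the opposite strand written 5' to 3' (the reverse complement), the complement of a DNA sequence may be computed as shown below. The example sequence is hypothetical and the function name is an assumption of the sketch.

```python
# Watson-Crick pairing for DNA; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical sequence for illustration only.
print(reverse_complement("ATGAAATTACCTTAA"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check.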

In another embodiment, the invention comprises an isolated polynucleotide
that hybridizes to a nucleotide sequence chosen from one of SEQ ID NO: 1 through
SEQ ID NO: 215 or SEQ ID NO:431 through SEQ ID NO:591, a complement thereof,
a degenerate variant thereof, or a fragment thereof, under high stringency
hybridization conditions. In yet other embodiments, the polynucleotide hybridizes
under intermediate stringency hybridization conditions.

In a preferred embodiment, an isolated polynucleotide of a Streptococcus
pneumoniae genomic sequence comprises a nucleotide sequence chosen from one
of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO:431 through SEQ ID
NO:591, a fragment thereof, or a degenerate variant thereof, and encodes a
polypeptide, a biological equivalent thereof, or a fragment thereof, selected from the
group consisting of a Streptococcus pneumoniae polypeptide having 0, 1 or 2
transmembrane domains, a Streptococcus pneumoniae polypeptide having 3 or more
transmembrane domains, a Streptococcus pneumoniae polypeptide having an outer
membrane domain or a periplasmic domain, a Streptococcus pneumoniae
polypeptide having an inner membrane domain, a Streptococcus pneumoniae
polypeptide identified by Blastp analysis, a Streptococcus pneumoniae polypeptide
identified by Pfam analysis, a Streptococcus pneumoniae lipoprotein, a
Streptococcus pneumoniae polypeptide having a LPXTG motif, wherein the
polypeptide is covalently attached to the peptidoglycan layer, a Streptococcus
pneumoniae polypeptide having a peptidoglycan binding motif, wherein the
polypeptide is associated with the peptidoglycan layer, a Streptococcus pneumoniae
polypeptide having a signal sequence and a C-terminal Tyrosine or Phenylalanine
amino acid, a Streptococcus pneumoniae polypeptide having a tripeptide RGD
sequence, a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed and a Streptococcus pneumoniae polypeptide identified by
proteomics as membrane associated.

In other embodiments, the isolated polynucleotide is a complement to a
Streptococcus pneumoniae genomic sequence comprising a nucleotide sequence
chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO:431
through SEQ ID NO:591, a fragment thereof, or a degenerate variant thereof, and
encodes a polypeptide, a biological equivalent thereof, or a fragment thereof,
selected from the group consisting of a Streptococcus pneumoniae polypeptide
having 0, 1 or 2 transmembrane domains, a Streptococcus pneumoniae polypeptide
having 3 or more transmembrane domains, a Streptococcus pneumoniae polypeptide
having an outer membrane domain or a periplasmic domain, a Streptococcus
pneumoniae polypeptide having an inner membrane domain, a Streptococcus
pneumoniae polypeptide identified by Blastp analysis, a Streptococcus pneumoniae
polypeptide identified by Pfam analysis, a Streptococcus pneumoniae lipoprotein, a
Streptococcus pneumoniae polypeptide having a LPXTG motif, wherein the
polypeptide is covalently attached to the peptidoglycan layer, a Streptococcus
pneumoniae polypeptide having a peptidoglycan binding motif, wherein the
polypeptide is associated with the peptidoglycan layer, a Streptococcus pneumoniae
polypeptide having a signal sequence and a C-terminal Tyrosine or Phenylalanine
amino acid, a Streptococcus pneumoniae polypeptide having a tripeptide RGD
sequence, a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed and a Streptococcus pneumoniae polypeptide identified by
proteomics as membrane associated. In certain embodiments, the polynucleotide is
selected from the group consisting of DNA, chromosomal DNA, cDNA and RNA and
may further comprise heterologous nucleotides. In still other embodiments, the
polynucleotide encodes a fusion polypeptide.

In a preferred embodiment, a polynucleotide encoding a polypeptide having 0, 
1 or 2 transmembrane domains comprises a nucleotide sequence chosen from one 
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 7, SEQ ID NO: 8, SEQ 
ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID
NO: 18, SEQ ID NO: 19, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID
NO: 25, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID 
NO: 36, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID 
NO: 47, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID 
NO: 55, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID
NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 67, SEQ ID
NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 72, SEQ ID NO: 73, SEQ ID 
NO: 74, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID 
NO: 85, SEQ ID NO: 86, SEQ ID NO: 89, SEQ ID NO: 91, SEQ ID NO: 92, SEQ ID 
NO: 95, SEQ ID NO: 96, SEQ ID NO: 97, SEQ ID NO: 100, SEQ ID NO: 104, SEQ
ID NO: 105, SEQ ID NO: 106, SEQ ID NO: 109, SEQ ID NO: 110, SEQ ID NO: 111,
SEQ ID NO: 113, SEQ ID NO: 116, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 
123, SEQ ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID 
NO: 131, SEQ ID NO: 132, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 137, 
SEQ ID NO: 138, SEQ ID NO: 141, SEQ ID NO: 142, SEQ ID NO: 143, SEQ ID NO:
144, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ ID NO: 150, SEQ ID
NO: 155, SEQ ID NO: 156, SEQ ID NO: 158, SEQ ID NO: 161, SEQ ID NO: 162, 
SEQ ID NO: 165, SEQ ID NO: 170, SEQ ID NO: 171, SEQ ID NO: 172, SEQ ID NO: 
174, SEQ ID NO: 176, SEQ ID NO: 179, SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID
NO: 187, SEQ ID NO: 192, SEQ ID NO: 195, SEQ ID NO: 196, SEQ ID NO: 197,
SEQ ID NO: 199, SEQ ID NO: 200, SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 
204, SEQ ID NO: 205, SEQ ID NO: 207, SEQ ID NO: 209 and SEQ ID NO: 210. 

In another preferred embodiment, a polynucleotide encoding a polypeptide
having 3 or more transmembrane domains comprises a nucleotide sequence chosen
from one of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 10, SEQ ID 
NO: 12, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID 
NO: 26, SEQ ID NO: 27, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID 
NO: 35, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 43, SEQ ID
NO: 44, SEQ ID NO: 46, SEQ ID NO: 48, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID
NO: 56, SEQ ID NO: 59, SEQ ID NO: 65, SEQ ID NO: 71, SEQ ID NO: 75, SEQ ID 
NO: 76, SEQ ID NO: 77, SEQ ID NO: 80, SEQ ID NO: 82, SEQ ID NO: 84, SEQ ID 
NO: 87, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID 
NO: 98, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 102, SEQ ID NO: 103, SEQ
ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 112, SEQ ID NO: 114, SEQ ID NO: 115,
SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 
124, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID 
NO: 139, SEQ ID NO: 140, SEQ ID NO: 145, SEQ ID NO: 146, SEQ ID NO: 151, 
SEQ ID NO: 152, SEQ ID NO: 153, SEQ ID NO: 154, SEQ ID NO: 157, SEQ ID NO:
159, SEQ ID NO: 160, SEQ ID NO: 163, SEQ ID NO: 164, SEQ ID NO: 166, SEQ ID
NO: 167, SEQ ID NO: 168, SEQ ID NO: 169, SEQ ID NO: 173, SEQ ID NO: 175, 
SEQ ID NO: 177, SEQ ID NO: 178, SEQ ID NO: 180, SEQ ID NO: 181, SEQ ID NO: 
182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 189, SEQ ID 
NO: 190, SEQ ID NO: 191, SEQ ID NO: 193, SEQ ID NO: 194, SEQ ID NO: 198,
SEQ ID NO: 203, SEQ ID NO: 206, SEQ ID NO: 208, SEQ ID NO: 211, SEQ ID NO:
212, SEQ ID NO: 213, SEQ ID NO: 214 and SEQ ID NO: 215. 

In other preferred embodiments, a polynucleotide encoding a polypeptide 
having an outer membrane domain or a periplasmic domain comprises a nucleotide 
sequence chosen from one of SEQ ID NO: 3, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID
NO: 23, SEQ ID NO: 39, SEQ ID NO: 50, SEQ ID NO: 62, SEQ ID NO: 67, SEQ ID
NO: 78, SEQ ID NO: 85, SEQ ID NO: 125, SEQ ID NO: 134, SEQ ID NO: 147, SEQ
ID NO: 165, SEQ ID NO: 172 and SEQ ID NO: 179.

In other preferred embodiments, a polynucleotide encoding a polypeptide
having an inner membrane domain comprises a nucleotide sequence chosen from 
one of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 10, 
SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, 
SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21,
SEQ ID NO: 22, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, 
SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, 
SEQ ID NO: 35, SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 40, 
SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48,
SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID NO: 56,
SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 65, SEQ ID NO: 68, 
SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 73, SEQ ID NO: 75, 
SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 81, 
SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID NO: 84, SEQ ID NO: 86, SEQ ID NO: 87,
SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 94,
SEQ ID NO: 95, SEQ ID NO: 96, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 99, 
SEQ ID NO: 100, SEQ ID NO: 101, SEQ ID NO: 102, SEQ ID NO: 103, SEQ ID NO: 
105, SEQ ID NO: 106, SEQ ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 109, SEQ ID 
NO: 112, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 117,
SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, SEQ ID NO:
122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID 
NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, SEQ ID NO: 132, 
SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 139, SEQ ID NO: 
140, SEQ ID NO: 141, SEQ ID NO: 142, SEQ ID NO: 144, SEQ ID NO: 145, SEQ ID
NO: 146, SEQ ID NO: 148, SEQ ID NO: 150, SEQ ID NO: 151, SEQ ID NO: 152,
SEQ ID NO: 153, SEQ ID NO: 154, SEQ ID NO: 156, SEQ ID NO: 157, SEQ ID NO: 
158, SEQ ID NO: 159, SEQ ID NO: 160, SEQ ID NO: 162, SEQ ID NO: 163, SEQ ID 
NO: 164, SEQ ID NO: 166, SEQ ID NO: 167, SEQ ID NO: 168, SEQ ID NO: 169, 
SEQ ID NO: 170, SEQ ID NO: 173, SEQ ID NO: 175, SEQ ID NO: 176, SEQ ID NO:
177, SEQ ID NO: 178, SEQ ID NO: 180, SEQ ID NO: 181, SEQ ID NO: 182, SEQ ID
NO: 184, SEQ ID NO: 186, SEQ ID NO: 187, SEQ ID NO: 188, SEQ ID NO: 189, 
SEQ ID NO: 190, SEQ ID NO: 191, SEQ ID NO: 192, SEQ ID NO: 193, SEQ ID NO: 
194, SEQ ID NO: 195, SEQ ID NO: 198, SEQ ID NO: 200, SEQ ID NO: 203, SEQ ID
NO: 206, SEQ ID NO: 208, SEQ ID NO: 209, SEQ ID NO: 211, SEQ ID NO: 212,
SEQ ID NO: 213, SEQ ID NO: 214 and SEQ ID NO: 215. 

In yet another preferred embodiment, a polynucleotide encoding a 
polypeptide identified by Blastp analysis comprises a nucleotide sequence chosen 
from one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 7, SEQ ID NO: 10, SEQ ID
NO: 12, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 24, SEQ ID NO: 27, SEQ ID 
NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID 
NO: 35, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID 
NO: 44, SEQ ID NO: 48, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID NO: 59, SEQ ID
NO: 60, SEQ ID NO: 61, SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 68, SEQ ID
NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID 
NO: 77, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 87, SEQ ID 
NO: 88, SEQ ID NO: 90, SEQ ID NO: 94, SEQ ID NO: 95, SEQ ID NO: 96, SEQ ID 
NO: 98, SEQ ID NO: 100, SEQ ID NO: 103, SEQ ID NO: 105, SEQ ID NO: 107, SEQ
ID NO: 108, SEQ ID NO: 109, SEQ ID NO: 112, SEQ ID NO: 113, SEQ ID NO: 115,
SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 
124, SEQ ID NO: 127, SEQ ID NO: 129, SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID 
NO: 133, SEQ ID NO: 134, SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 138, 
SEQ ID NO: 139, SEQ ID NO: 141, SEQ ID NO: 144, SEQ ID NO: 146, SEQ ID NO:
147, SEQ ID NO: 151, SEQ ID NO: 152, SEQ ID NO: 154, SEQ ID NO: 155, SEQ ID
NO: 157, SEQ ID NO: 158, SEQ ID NO: 159, SEQ ID NO: 160, SEQ ID NO: 161, 
SEQ ID NO: 162, SEQ ID NO: 163, SEQ ID NO: 165, SEQ ID NO: 166, SEQ ID NO: 
167, SEQ ID NO: 169, SEQ ID NO: 172, SEQ ID NO: 173, SEQ ID NO: 176, SEQ ID 
NO: 177, SEQ ID NO: 178, SEQ ID NO: 180, SEQ ID NO: 181, SEQ ID NO: 182,
SEQ ID NO: 184, SEQ ID NO: 185, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO:
189, SEQ ID NO: 191, SEQ ID NO: 193, SEQ ID NO: 196, SEQ ID NO: 197, SEQ ID 
NO: 198, SEQ ID NO: 199, SEQ ID NO: 200, SEQ ID NO: 201, SEQ ID NO: 202, 
SEQ ID NO: 204, SEQ ID NO: 205, SEQ ID NO: 206, SEQ ID NO: 207, SEQ ID NO: 
208, SEQ ID NO: 210, SEQ ID NO: 212, SEQ ID NO: 213 and SEQ ID NO: 214. 

In still further preferred embodiments, a polynucleotide encoding a
polypeptide identified by Pfam analysis comprises a nucleotide sequence chosen
from one of SEQ ID NO: 4, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 41, SEQ ID
NO: 45, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 63, SEQ ID
NO: 64, SEQ ID NO: 66, SEQ ID NO: 72, SEQ ID NO: 74, SEQ ID NO: 89, SEQ ID
NO: 92, SEQ ID NO: 104, SEQ ID NO: 111, SEQ ID NO: 116, SEQ ID NO: 119, SEQ 
ID NO: 128, SEQ ID NO: 137, SEQ ID NO: 142, SEQ ID NO: 143, SEQ ID NO: 149, 
SEQ ID NO: 151, SEQ ID NO: 152, SEQ ID NO: 153, SEQ ID NO: 157, SEQ ID NO: 
159, SEQ ID NO: 160, SEQ ID NO: 162, SEQ ID NO: 163, SEQ ID NO: 164, SEQ ID
NO: 165, SEQ ID NO: 166, SEQ ID NO: 169, SEQ ID NO: 171, SEQ ID NO: 174, 
SEQ ID NO: 176, SEQ ID NO: 180, SEQ ID NO: 182, SEQ ID NO: 183, SEQ ID NO: 
184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 189, SEQ ID NO: 195, SEQ ID
NO: 198, SEQ ID NO: 199, SEQ ID NO: 205, SEQ ID NO: 212 and SEQ ID NO: 213.

In another preferred embodiment, a polynucleotide encoding a lipoprotein
comprises a nucleotide sequence chosen from one of SEQ ID NO: 3, SEQ ID NO: 8,
SEQ ID NO: 9, SEQ ID NO: 13, SEQ ID NO: 21, SEQ ID NO: 26, SEQ ID NO: 34,
SEQ ID NO: 62, SEQ ID NO: 67, SEQ ID NO: 85, SEQ ID NO: 134, SEQ ID NO: 
147, SEQ ID NO: 150, SEQ ID NO: 168, SEQ ID NO: 170 and SEQ ID NO: 173. 

In other preferred embodiments, a polynucleotide encoding a polypeptide
having an LPXTG motif and covalently attached to the peptidoglycan layer
comprises a nucleotide sequence chosen from one of SEQ ID NO: 13, SEQ ID NO:
21, SEQ ID NO: 34 and SEQ ID NO: 170; or a polynucleotide encoding a polypeptide
having a peptidoglycan binding motif and associated with the peptidoglycan layer
comprises a nucleotide sequence chosen from one of SEQ ID NO: 25, SEQ ID NO:
49 and SEQ ID NO: 110.
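
For illustration, a polypeptide sequence may be scanned for an LPXTG motif, where X is any amino acid, with a simple pattern match. The following is a minimal sketch and is not the screening procedure used to compile the lists above; the example sequences and the function name are hypothetical, and the presence of the motif alone does not establish attachment to the peptidoglycan layer.

```python
import re

# L-P-any residue-T-G, written with one-letter amino acid codes.
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(protein: str) -> list[int]:
    """Return the 1-based start position of every LPXTG match."""
    return [m.start() + 1 for m in LPXTG.finditer(protein.upper())]


# Hypothetical sequences for illustration only.
print(find_lpxtg("MKKTLAVSLPETGEESNAAL"))  # motif begins at residue 9
print(find_lpxtg("MKKTLAVSLPETAEESNAAL"))  # no motif
```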

In another preferred embodiment, a polynucleotide encoding a polypeptide
having a signal sequence and a C-terminal Tyrosine or Phenylalanine amino acid
comprises a nucleotide sequence chosen from one of SEQ ID NO:11, SEQ ID
NO:39, SEQ ID NO:73, SEQ ID NO:97, SEQ ID NO:106, SEQ ID NO:125 and SEQ
ID NO:187.

In yet another preferred embodiment, a polynucleotide encoding a
polypeptide having a tripeptide RGD sequence that potentially is involved in cell
attachment comprises a nucleotide sequence chosen from one of SEQ ID NO:1,
SEQ ID NO:21, SEQ ID NO:66 and SEQ ID NO:67.
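
The two sequence features recited in the preceding paragraphs, a tripeptide RGD sequence and a C-terminal Tyrosine or Phenylalanine, may be checked on a polypeptide sequence as sketched below. This is a hedged illustration only: the function names and example sequences are hypothetical, and detection of the signal sequence, which the C-terminal criterion also requires, is outside the scope of the sketch.

```python
def has_rgd(protein: str) -> bool:
    """True when the tripeptide Arg-Gly-Asp (RGD) occurs anywhere in the sequence."""
    return "RGD" in protein.upper()


def ends_with_tyr_or_phe(protein: str) -> bool:
    """True when the C-terminal residue is Tyrosine (Y) or Phenylalanine (F)."""
    return protein.upper().endswith(("Y", "F"))


# Hypothetical sequences for illustration only.
print(has_rgd("MKTARGDLLS"), ends_with_tyr_or_phe("MKTARGDLLS"))
print(has_rgd("MKTAAGDLLF"), ends_with_tyr_or_phe("MKTAAGDLLF"))
```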

In another preferred embodiment, a polynucleotide encoding a polypeptide
identified by proteomics as surface exposed comprises a nucleotide sequence
chosen from one of SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:46,
SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ
ID NO:74, SEQ ID NO:91, SEQ ID NO:103, SEQ ID NO:116, SEQ ID NO:128, SEQ
ID NO:131, SEQ ID NO:136, SEQ ID NO:151, SEQ ID NO:156, SEQ ID NO:159,
SEQ ID NO:162, SEQ ID NO:164, SEQ ID NO:172, SEQ ID NO:176, SEQ ID
NO:178, SEQ ID NO:179, SEQ ID NO:180, SEQ ID NO:182 and SEQ ID NO:205.

In still another embodiment, a polynucleotide encoding a polypeptide 
identified by proteomics as membrane associated comprises a nucleotide sequence 
chosen from one of SEQ ID NO:431 through SEQ ID NO:591. 

In certain aspects, the invention relates to Streptococcus pneumoniae
polypeptides. More particularly, the invention relates to Streptococcus pneumoniae
polypeptides, more preferably antigenic polypeptides, encoded by Streptococcus
pneumoniae polynucleotide open reading frames. Thus, in certain embodiments, an
isolated polypeptide is encoded by a polynucleotide comprising a nucleotide
sequence having at least about 95% identity to a nucleotide sequence chosen from
one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID
NO: 591, a degenerate variant thereof, or a fragment thereof. In a preferred
embodiment, the isolated polypeptide encoded by one of the above polynucleotides
comprises an amino acid sequence having at least about 95% identity to an amino
acid sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or
SEQ ID NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or a
fragment thereof. In other embodiments, the polypeptide is a fusion polypeptide. In
a preferred embodiment, the polypeptide immunoreacts with seropositive serum of
an individual infected with Streptococcus pneumoniae.
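
The "at least about 95% identity" threshold recited above may be illustrated with a simple calculation over two sequences that have already been aligned. The sketch assumes that percent identity is the number of identical aligned positions divided by the number of alignment columns, with "-" marking a gap; this is an assumption made for illustration, since the definition of identity and any alignment algorithm relied upon by the application are not reproduced in this section. The sequences and the function name are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: gaps are written as '-', columns that are gaps in both
    sequences are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical 20-residue sequences differing at a single position.
seq_a = "MKKTLAVSLPETGEESNAAL"
seq_b = "MKKTLAVSLPETGEESNAVL"
print(percent_identity(seq_a, seq_b) >= 95.0)
```

Other conventions, for example dividing by the length of the shorter sequence, give different values for the same alignment, so the denominator should be stated whenever a threshold is applied.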

In preferred embodiments, the isolated polypeptide encoded by a 

25 polynucleotide comprising a nucleotide sequence having at least about 95% identity 
to a nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 
215 or SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, or a 
fragment thereof, is further defined as a Streptococcus pneumoniae polypeptide 
having 0, 1 or 2 transmembrane domains, a Streptococcus pneumoniae polypeptide 

30 having 3 or more transmembrane domains, a Streptococcus pneumoniae polypeptide 
having an outer membrane domain or a periplasmic domain, a Streptococcus 
pneumoniae polypeptide having an inner membrane domain, a Streptococcus 
pneumoniae polypeptide identified by Blastp analysis, a Streptococcus pneumoniae 
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polypeptide identified by Pfam analysis, a Streptococcus pneumoniae lipoprotein, a 
Streptococcus pneumoniae polypeptide having a LPXTG motif, wherein the 
polypeptide is covalently attached to the peptidoglycan layer, a Streptococcus 
pneumoniae polypeptide having a peptidoglycan binding motif, wherein the 
5 polypeptide is associated with the peptidoglycan layer, a Streptococcus pneumoniae 
polypeptide having a signal sequence and a C-terminal Tyrosine or Phenylalanine 
amino acid, a Streptococcus pneumoniae polypeptide having a tripeptide RGD 
sequence, a Streptococcus pneumoniae polypeptide identified by proteomics as 
surface exposed or a Streptococcus pneumoniae polypeptide identified by 

10 proteomics as membrane associated, where each of these groups has the set of 
ORFs identified above as within SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID 
NO: 431 through SEQ ID NO: 591. 
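The "at least about 95% identity" criterion used throughout can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identical residues over all alignment columns. This passage does not specify an alignment tool or an identity convention, so the function names and the choice of denominator below are illustrative assumptions only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = matching columns / all alignment columns,
    ignoring columns where both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    # A column matches only when both residues are present and equal.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True when the pair reaches the (hypothetical default) 95% threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so the threshold is only meaningful once a convention is fixed.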

In a particularly preferred embodiment, an isolated polypeptide comprises an 
amino acid sequence having at least about 95% identity to an amino acid sequence
chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592
through SEQ ID NO: 752, a biological equivalent thereof, or a fragment thereof. In 
another embodiment, the polypeptide is a fusion polypeptide. In a particularly 
preferred embodiment, the polypeptide immunoreacts with seropositive serum of an 
individual infected with Streptococcus pneumoniae. In yet other preferred
embodiments, the polypeptide is further defined as a Streptococcus pneumoniae
polypeptide having 0, 1 or 2 transmembrane domains, a Streptococcus pneumoniae 
polypeptide having 3 or more transmembrane domains, a Streptococcus pneumoniae 
polypeptide having an outer membrane domain or a periplasmic domain, a 
Streptococcus pneumoniae polypeptide having an inner membrane domain, a
Streptococcus pneumoniae polypeptide identified by Blastp analysis, a
Streptococcus pneumoniae polypeptide identified by Pfam analysis, a Streptococcus 
pneumoniae lipoprotein, a Streptococcus pneumoniae polypeptide having a LPXTG 
motif, wherein the polypeptide is covalently attached to the peptidoglycan layer, a 
Streptococcus pneumoniae polypeptide having a peptidoglycan binding motif,
wherein the polypeptide is associated with the peptidoglycan layer, a Streptococcus
pneumoniae polypeptide having a signal sequence and a C-terminal Tyrosine or 
Phenylalanine amino acid, a Streptococcus pneumoniae polypeptide having a 
tripeptide RGD sequence, a Streptococcus pneumoniae polypeptide identified by 
proteomics as surface exposed or a Streptococcus pneumoniae polypeptide
identified by proteomics as membrane associated. 

In a preferred embodiment, a polypeptide having 0, 1 or 2 transmembrane 
domains comprises an amino acid sequence chosen from one of SEQ ID NO: 216, 
SEQ ID NO: 218, SEQ ID NO: 219, SEQ ID NO: 222, SEQ ID NO: 223, SEQ ID NO:
224, SEQ ID NO: 226, SEQ ID NO: 228, SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID 
NO: 233, SEQ ID NO: 234, SEQ ID NO: 237, SEQ ID NO: 238, SEQ ID NO: 239, 
SEQ ID NO: 240, SEQ ID NO: 243, SEQ ID NO: 244, SEQ ID NO: 247, SEQ ID NO: 
249, SEQ ID NO: 251, SEQ ID NO: 254, SEQ ID NO: 256, SEQ ID NO: 257, SEQ ID
NO: 260, SEQ ID NO: 262, SEQ ID NO: 264, SEQ ID NO: 265, SEQ ID NO: 266,
SEQ ID NO: 268, SEQ ID NO: 270, SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 
275, SEQ ID NO: 276, SEQ ID NO: 277, SEQ ID NO: 278, SEQ ID NO: 279, SEQ ID 
NO: 281, SEQ ID NO: 282, SEQ ID NO: 283, SEQ ID NO: 284, SEQ ID NO: 285, 
SEQ ID NO: 286, SEQ ID NO: 287, SEQ ID NO: 289, SEQ ID NO: 293, SEQ ID NO:
294, SEQ ID NO: 296, SEQ ID NO: 298, SEQ ID NO: 300, SEQ ID NO: 301, SEQ ID
NO: 304, SEQ ID NO: 306, SEQ ID NO: 307, SEQ ID NO: 310, SEQ ID NO: 311, 
SEQ ID NO: 312, SEQ ID NO: 315, SEQ ID NO: 319, SEQ ID NO: 320, SEQ ID NO: 
321, SEQ ID NO: 324, SEQ ID NO: 325, SEQ ID NO: 326, SEQ ID NO: 328, SEQ ID 
NO: 331, SEQ ID NO: 336, SEQ ID NO: 337, SEQ ID NO: 338, SEQ ID NO: 340,
SEQ ID NO: 341, SEQ ID NO: 342, SEQ ID NO: 343, SEQ ID NO: 346, SEQ ID NO:
347, SEQ ID NO: 349, SEQ ID NO: 351, SEQ ID NO: 352, SEQ ID NO: 353, SEQ ID 
NO: 356, SEQ ID NO: 357, SEQ ID NO: 358, SEQ ID NO: 359, SEQ ID NO: 362, 
SEQ ID NO: 363, SEQ ID NO: 364, SEQ ID NO: 365, SEQ ID NO: 370, SEQ ID NO: 
371, SEQ ID NO: 373, SEQ ID NO: 376, SEQ ID NO: 377, SEQ ID NO: 380, SEQ ID
NO: 385, SEQ ID NO: 386, SEQ ID NO: 387, SEQ ID NO: 389, SEQ ID NO: 391,
SEQ ID NO: 394, SEQ ID NO: 398, SEQ ID NO: 400, SEQ ID NO: 402, SEQ ID NO: 
407, SEQ ID NO: 410, SEQ ID NO: 411, SEQ ID NO: 412, SEQ ID NO: 414, SEQ ID 
NO: 415, SEQ ID NO: 416, SEQ ID NO: 417, SEQ ID NO: 419, SEQ ID NO: 420, 
SEQ ID NO: 422, SEQ ID NO: 424, SEQ ID NO: 425, a biological equivalent thereof,
or a fragment thereof.

In another preferred embodiment, a polypeptide having 3 or more 
transmembrane domains comprises an amino acid sequence chosen from one of 
SEQ ID NO: 217, SEQ ID NO: 220, SEQ ID NO: 221, SEQ ID NO: 225, SEQ ID NO: 
227, SEQ ID NO: 229, SEQ ID NO: 230, SEQ ID NO: 235, SEQ ID NO: 236, SEQ ID
NO: 241, SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID NO: 246, SEQ ID NO: 248, 
SEQ ID NO: 250, SEQ ID NO: 252, SEQ ID NO: 253, SEQ ID NO: 255, SEQ ID NO: 
258, SEQ ID NO: 259, SEQ ID NO: 261, SEQ ID NO: 263, SEQ ID NO: 267, SEQ ID 
NO: 269, SEQ ID NO: 271, SEQ ID NO: 274, SEQ ID NO: 280, SEQ ID NO: 286,
SEQ ID NO: 290, SEQ ID NO: 291, SEQ ID NO: 292, SEQ ID NO: 295, SEQ ID NO: 
297, SEQ ID NO: 299, SEQ ID NO: 302, SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID 
NO: 308, SEQ ID NO: 309, SEQ ID NO: 313, SEQ ID NO: 314, SEQ ID NO: 316, 
SEQ ID NO: 317, SEQ ID NO: 318, SEQ ID NO: 322, SEQ ID NO: 323, SEQ ID NO:
327, SEQ ID NO: 329, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID NO: 333, SEQ ID
NO: 334, SEQ ID NO: 335, SEQ ID NO: 339, SEQ ID NO: 344, SEQ ID NO: 345, 
SEQ ID NO: 348, SEQ ID NO: 350, SEQ ID NO: 354, SEQ ID NO: 355, SEQ ID NO: 
360, SEQ ID NO: 361, SEQ ID NO: 366, SEQ ID NO: 367, SEQ ID NO: 368, SEQ ID 
NO: 369, SEQ ID NO: 372, SEQ ID NO: 374, SEQ ID NO: 375, SEQ ID NO: 378,
SEQ ID NO: 379, SEQ ID NO: 381, SEQ ID NO: 382, SEQ ID NO: 383, SEQ ID NO:
384, SEQ ID NO: 388, SEQ ID NO: 390, SEQ ID NO: 392, SEQ ID NO: 393, SEQ ID 
NO: 395, SEQ ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, SEQ ID NO: 401, 
SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 405, SEQ ID NO: 406, SEQ ID NO: 
408, SEQ ID NO: 409, SEQ ID NO: 413, SEQ ID NO: 418, SEQ ID NO: 421, SEQ ID
NO: 423, SEQ ID NO: 426, SEQ ID NO: 427, SEQ ID NO: 428, SEQ ID NO: 429,
SEQ ID NO: 430, a biological equivalent thereof, or a fragment thereof. 

In yet other preferred embodiments, a polypeptide having an outer membrane 
domain or a periplasmic domain comprises an amino acid sequence chosen from 
one of SEQ ID NO: 218, SEQ ID NO: 223, SEQ ID NO: 224, SEQ ID NO: 238, SEQ
ID NO: 254, SEQ ID NO: 265, SEQ ID NO: 277, SEQ ID NO: 282, SEQ ID NO: 293,
SEQ ID NO: 300, SEQ ID NO: 340, SEQ ID NO: 349, SEQ ID NO: 362, SEQ ID NO: 
380, SEQ ID NO: 387, SEQ ID NO: 394, a biological equivalent thereof, or a 
fragment thereof. 

In yet other preferred embodiments, a polypeptide
having an inner membrane domain comprises an amino acid sequence chosen from
one of SEQ ID NO: 217, SEQ ID NO: 220, SEQ ID NO: 221, SEQ ID NO: 222, SEQ 
ID NO: 225, SEQ ID NO: 226, SEQ ID NO: 227, SEQ ID NO: 228, SEQ ID NO: 229, 
SEQ ID NO: 230, SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 234, SEQ ID NO: 
235, SEQ ID NO: 236, SEQ ID NO: 237, SEQ ID NO: 241, SEQ ID NO: 242, SEQ ID
NO: 243, SEQ ID NO: 244, SEQ ID NO: 245, SEQ ID NO: 246, SEQ ID NO: 247, 
SEQ ID NO: 248, SEQ ID NO: 249, SEQ ID NO: 250, SEQ ID NO: 251, SEQ ID NO: 
252, SEQ ID NO: 253, SEQ ID NO: 255, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID 
NO: 261, SEQ ID NO: 262, SEQ ID NO: 263, SEQ ID NO: 266, SEQ ID NO: 267,
SEQ ID NO: 268, SEQ ID NO: 269, SEQ ID NO: 271, SEQ ID NO: 274, SEQ ID NO: 
275, SEQ ID NO: 276, SEQ ID NO: 280, SEQ ID NO: 283, SEQ ID NO: 284, SEQ ID 
NO: 285, SEQ ID NO: 286, SEQ ID NO: 288, SEQ ID NO: 290, SEQ ID NO: 291, 
SEQ ID NO: 292, SEQ ID NO: 294, SEQ ID NO: 295, SEQ ID NO: 296, SEQ ID NO:
297, SEQ ID NO: 298, SEQ ID NO: 299, SEQ ID NO: 301, SEQ ID NO: 302, SEQ ID
NO: 303, SEQ ID NO: 305, SEQ ID NO: 306, SEQ ID NO: 308, SEQ ID NO: 309, 
SEQ ID NO: 310, SEQ ID NO: 311, SEQ ID NO: 312, SEQ ID NO: 313, SEQ ID NO: 
314, SEQ ID NO: 315, SEQ ID NO: 316, SEQ ID NO: 317, SEQ ID NO: 318, SEQ ID 
NO: 320, SEQ ID NO: 321, SEQ ID NO: 322, SEQ ID NO: 323, SEQ ID NO: 324,
SEQ ID NO: 327, SEQ ID NO: 328, SEQ ID NO: 329, SEQ ID NO: 330, SEQ ID NO:
332, SEQ ID NO: 333, SEQ ID NO: 334, SEQ ID NO: 335, SEQ ID NO: 336, SEQ ID 
NO: 337, SEQ ID NO: 338, SEQ ID NO: 339, SEQ ID NO: 341, SEQ ID NO: 342, 
SEQ ID NO: 343, SEQ ID NO: 344, SEQ ID NO: 345, SEQ ID NO: 346, SEQ ID NO: 
347, SEQ ID NO: 348, SEQ ID NO: 350, SEQ ID NO: 351, SEQ ID NO: 354, SEQ ID
NO: 355, SEQ ID NO: 356, SEQ ID NO: 357, SEQ ID NO: 359, SEQ ID NO: 360,
SEQ ID NO: 361, SEQ ID NO: 362, SEQ ID NO: 365, SEQ ID NO: 366, SEQ ID NO: 
367, SEQ ID NO: 368, SEQ ID NO: 369, SEQ ID NO: 371, SEQ ID NO: 372, SEQ ID 
NO: 373, SEQ ID NO: 374, SEQ ID NO: 375, SEQ ID NO: 377, SEQ ID NO: 378, 
SEQ ID NO: 379, SEQ ID NO: 381, SEQ ID NO: 382, SEQ ID NO: 383, SEQ ID NO:
384, SEQ ID NO: 385, SEQ ID NO: 388, SEQ ID NO: 390, SEQ ID NO: 391, SEQ ID
NO: 392, SEQ ID NO: 393, SEQ ID NO: 395, SEQ ID NO: 396, SEQ ID NO: 397, 
SEQ ID NO: 399, SEQ ID NO: 401, SEQ ID NO: 402, SEQ ID NO: 403, SEQ ID NO: 
404, SEQ ID NO: 405, SEQ ID NO: 406, SEQ ID NO: 407, SEQ ID NO: 408, SEQ ID 
NO: 409, SEQ ID NO: 410, SEQ ID NO: 413, SEQ ID NO: 415, SEQ ID NO: 418,
SEQ ID NO: 421, SEQ ID NO: 423, SEQ ID NO: 424, SEQ ID NO: 426, SEQ ID NO:
427, SEQ ID NO: 428, SEQ ID NO: 429, SEQ ID NO: 430, a biological equivalent 
thereof, or a fragment thereof. 

In still another preferred embodiment, a polypeptide identified by Blastp
analysis comprises an amino acid sequence chosen from one of SEQ ID NO: 216, 
SEQ ID NO: 217, SEQ ID NO: 222, SEQ ID NO: 225, SEQ ID NO: 227, SEQ ID NO: 
231, SEQ ID NO: 235, SEQ ID NO: 239, SEQ ID NO: 242, SEQ ID NO: 245, SEQ ID 
NO: 246, SEQ ID NO: 247, SEQ ID NO: 248, SEQ ID NO: 249, SEQ ID NO: 250,
SEQ ID NO: 253, SEQ ID NO: 255, SEQ ID NO: 257, SEQ ID NO: 258, SEQ ID NO: 
259, SEQ ID NO: 263, SEQ ID NO: 266, SEQ ID NO: 268, SEQ ID NO: 269, SEQ ID 
NO: 275, SEQ ID NO: 276, SEQ ID NO: 280, SEQ ID NO: 282, SEQ ID NO: 283, 
SEQ ID NO: 284, SEQ ID NO: 285, SEQ ID NO: 286, SEQ ID NO: 290, SEQ ID NO:
291, SEQ ID NO: 292, SEQ ID NO: 293, SEQ ID NO: 294, SEQ ID NO: 295, SEQ ID
NO: 302, SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 309, SEQ ID NO: 310, 
SEQ ID NO: 311, SEQ ID NO: 313, SEQ ID NO: 315, SEQ ID NO: 318, SEQ ID NO: 
320, SEQ ID NO: 322, SEQ ID NO: 323, SEQ ID NO: 324, SEQ ID NO: 327, SEQ ID 
NO: 328, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID NO: 333, SEQ ID NO: 337,
SEQ ID NO: 338, SEQ ID NO: 339, SEQ ID NO: 342, SEQ ID NO: 344, SEQ ID NO:
346, SEQ ID NO: 347, SEQ ID NO: 348, SEQ ID NO: 349, SEQ ID NO: 350, SEQ ID 
NO: 351, SEQ ID NO: 353, SEQ ID NO: 354, SEQ ID NO: 356, SEQ ID NO: 359, 
SEQ ID NO: 361, SEQ ID NO: 362, SEQ ID NO: 366, SEQ ID NO: 367, SEQ ID NO:
369, SEQ ID NO: 370, SEQ ID NO: 372, SEQ ID NO: 373, SEQ ID NO: 374, SEQ ID
NO: 375, SEQ ID NO: 376, SEQ ID NO: 377, SEQ ID NO: 378, SEQ ID NO: 380,
SEQ ID NO: 381, SEQ ID NO: 382, SEQ ID NO: 384, SEQ ID NO: 387, SEQ ID NO: 
388, SEQ ID NO: 391, SEQ ID NO: 392, SEQ ID NO: 393, SEQ ID NO: 395, SEQ ID 
NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, SEQ ID NO: 400, SEQ ID NO: 401, 
SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 406, SEQ ID NO: 408, SEQ ID NO:
411, SEQ ID NO: 412, SEQ ID NO: 413, SEQ ID NO: 414, SEQ ID NO: 415, SEQ ID
NO: 416, SEQ ID NO: 417, SEQ ID NO: 419, SEQ ID NO: 420, SEQ ID NO: 421, 
SEQ ID NO: 422, SEQ ID NO: 423, SEQ ID NO: 425, SEQ ID NO: 427, SEQ ID NO: 
428, SEQ ID NO: 429, a biological equivalent thereof, or a fragment thereof. 

In other preferred embodiments, a polypeptide identified by Pfam analysis
comprises an amino acid sequence chosen from one of SEQ ID NO: 219, SEQ ID
NO: 233, SEQ ID NO: 234, SEQ ID NO: 255, SEQ ID NO: 260, SEQ ID NO: 270, 
SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 278, SEQ ID NO: 279, SEQ ID NO: 
281, SEQ ID NO: 287, SEQ ID NO: 289, SEQ ID NO: 304, SEQ ID NO: 307, SEQ ID 
NO: 319, SEQ ID NO: 326, SEQ ID NO: 331, SEQ ID NO: 334, SEQ ID NO: 343,
SEQ ID NO: 352, SEQ ID NO: 357, SEQ ID NO: 358, SEQ ID NO: 364, SEQ ID NO: 
366, SEQ ID NO: 367, SEQ ID NO: 368, SEQ ID NO: 372, SEQ ID NO: 374, SEQ ID 
NO: 375, SEQ ID NO: 377, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO: 380, 
SEQ ID NO: 381, SEQ ID NO: 384, SEQ ID NO: 386, SEQ ID NO: 389, SEQ ID NO:
391, SEQ ID NO: 395, SEQ ID NO: 397, SEQ ID NO: 398, SEQ ID NO: 399, SEQ ID 
NO: 401, SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 410, SEQ ID NO: 413,
SEQ ID NO: 414, SEQ ID NO: 420, SEQ ID NO: 427, SEQ ID NO: 428, a biological
equivalent thereof, or a fragment thereof. 

In one preferred embodiment, a polypeptide is a lipoprotein and comprises an
amino acid sequence chosen from one of SEQ ID NO: 218, SEQ ID NO: 223, SEQ
ID NO: 224, SEQ ID NO: 228, SEQ ID NO: 236, SEQ ID NO: 241, SEQ ID NO: 249, 
SEQ ID NO: 277, SEQ ID NO: 282, SEQ ID NO: 300, SEQ ID NO: 349, SEQ ID NO: 
362, SEQ ID NO: 365, SEQ ID NO: 383, SEQ ID NO: 385, SEQ ID NO: 388, a
biological equivalent thereof, or a fragment thereof.

In certain other preferred embodiments, a polypeptide having a LPXTG motif 
and covalently attached to the peptidoglycan layer, comprises an amino acid 
sequence chosen from one of SEQ ID NO: 228, SEQ ID NO: 236, SEQ ID NO: 249, 
SEQ, SEQ ID NO: 385, a biological equivalent thereof, or a fragment thereof; or a
polypeptide having a peptidoglycan binding motif and associated with the
peptidoglycan layer comprises an amino acid sequence chosen from one of SEQ ID 
NO: 240, SEQ ID NO: 264, SEQ ID NO: 325, a biological equivalent thereof, or a 
fragment thereof. 
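Because the same SEQ ID NO can appear in several of the groups listed above, set operations make the overlaps easy to inspect. The sketch below transcribes three of the lists (outer membrane or periplasmic domain, lipoprotein, LPXTG motif) as Python sets. The variable and function names are hypothetical, and the LPXTG set may be incomplete because one entry in that list is garbled in the source text.

```python
# SEQ ID NOs transcribed from the lists above (names are illustrative only).
OUTER_OR_PERIPLASMIC = {218, 223, 224, 238, 254, 265, 277, 282,
                        293, 300, 340, 349, 362, 380, 387, 394}
LIPOPROTEIN = {218, 223, 224, 228, 236, 241, 249, 277,
               282, 300, 349, 362, 365, 383, 385, 388}
# One entry of the LPXTG list is unreadable in the source, so this set
# holds only the four legible identifiers.
LPXTG_MOTIF = {228, 236, 249, 385}

GROUPS = {
    "outer membrane or periplasmic domain": OUTER_OR_PERIPLASMIC,
    "lipoprotein": LIPOPROTEIN,
    "LPXTG motif": LPXTG_MOTIF,
}


def groups_for(seq_id: int) -> list[str]:
    """Return the names of every transcribed group that lists seq_id."""
    return sorted(name for name, members in GROUPS.items() if seq_id in members)


# Identifiers listed both as lipoproteins and as outer membrane/periplasmic.
shared = sorted(OUTER_OR_PERIPLASMIC & LIPOPROTEIN)
```

With these transcriptions, every legible LPXTG entry also appears in the lipoprotein list, and eight identifiers are shared between the lipoprotein and outer membrane or periplasmic lists.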

In another preferred embodiment, a polypeptide having a signal sequence 
and a C-terminal Tyrosine or Phenylalanine amino acid comprises an amino acid
sequence chosen from one of SEQ ID NO:226, SEQ ID NO:254, SEQ ID NO:289, 
SEQ ID NO:312, SEQ ID NO:321, SEQ ID NO: 340, SEQ ID NO:402, a biological 
equivalent thereof, or a fragment thereof. 

In yet another preferred embodiment, a polypeptide having a tripeptide RGD 
sequence that potentially is involved in cell attachment comprises an amino acid
sequence chosen from one of SEQ ID NO:216, SEQ ID NO:236, SEQ ID NO:281,
SEQ ID NO:282, a biological equivalent thereof, or a fragment thereof. 
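The sequence features named in the preceding embodiments (an LPXTG motif, a tripeptide RGD sequence, and a C-terminal tyrosine or phenylalanine) can be screened for with simple pattern matching. The following is a minimal sketch using one-letter amino acid codes, with X in LPXTG treated as any residue. It does not attempt signal sequence prediction, and the function name and example sequence are hypothetical.

```python
import re

# LPXTG: Leu-Pro-any residue-Thr-Gly; RGD: Arg-Gly-Asp.
LPXTG_PATTERN = re.compile(r"LP.TG")
RGD_PATTERN = re.compile(r"RGD")


def motif_summary(protein: str) -> dict:
    """Report 0-based motif positions and the C-terminal residue check.

    Note: a signal sequence is NOT detected here; only the C-terminal
    Tyr/Phe condition of that embodiment is checked.
    """
    protein = protein.upper()
    return {
        "lpxtg_positions": [m.start() for m in LPXTG_PATTERN.finditer(protein)],
        "rgd_positions": [m.start() for m in RGD_PATTERN.finditer(protein)],
        "c_terminal_tyr_or_phe": protein.endswith(("Y", "F")),
    }


# Made-up example sequence, not taken from the sequence listing.
example = motif_summary("MKRGDAALPETGEEF")
```

A motif hit on its own does not establish covalent attachment to the peptidoglycan layer or a role in cell attachment; it only flags candidates for the groupings described above.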

In still another embodiment, a polypeptide identified by proteomics as surface
exposed comprises an amino acid sequence chosen from one of SEQ ID NO: 229, 
SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 261, SEQ ID NO: 279, SEQ ID NO: 
281, SEQ ID NO: 282, SEQ ID NO: 284, SEQ ID NO: 286, SEQ ID NO: 289, SEQ ID 
NO: 306, SEQ ID NO: 318, SEQ ID NO: 331, SEQ ID NO: 343, SEQ ID NO: 346,
SEQ ID NO: 351, SEQ ID NO: 366, SEQ ID NO: 371, SEQ ID NO: 374, SEQ ID NO: 
377, SEQ ID NO: 379, SEQ ID NO: 387, SEQ ID NO: 391, SEQ ID NO: 393, SEQ ID 
NO: 394, SEQ ID NO: 395, SEQ ID NO: 397, SEQ ID NO: 420, a biological 
equivalent thereof, or a fragment thereof. 

In yet another embodiment, a polypeptide identified by proteomics as
membrane associated comprises an amino acid sequence chosen from one of SEQ
ID NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or a fragment 
thereof. 

In another aspect of the invention, the polypeptides are expressed and 
purified in a recombinant expression system. Thus, in certain embodiments, the
invention provides a recombinant expression vector comprising a nucleotide 
sequence having at least about 95% identity to a nucleotide sequence chosen from 
one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID 
NO: 591, a degenerate variant thereof, or a fragment thereof. In certain other 
embodiments, the polynucleotide is selected from the group consisting of DNA,
chromosomal DNA, cDNA, RNA and antisense RNA. In another embodiment, the 
polynucleotide comprised within the vector further comprises heterologous nucleotide 
sequences. In other embodiments, the polynucleotide is operatively linked to one or 
more gene expression regulatory elements. In yet other embodiments, the 
polynucleotide encodes a polypeptide comprising an amino acid sequence having at
least about 95% identity to an amino acid sequence chosen from one of SEQ ID NO: 
216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ ID NO: 752, a 
biological equivalent thereof, or a fragment thereof. In a preferred embodiment, the 
vector is a plasmid. 

In another aspect of the invention, there is provided a genetically engineered
host cell, transfected, transformed or infected with a recombinant expression vector
comprising a nucleotide sequence having at least about 95% identity to a nucleotide 
sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID 
NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, or a fragment
thereof. In a preferred embodiment, the host cell is a bacterial cell. In a further 
embodiment, the polynucleotide is expressed under suitable conditions to produce 
the encoded polypeptide, a biological equivalent thereof, or a fragment thereof, which 
is then recovered.

In other embodiments, the present invention provides an antibody specific for 
a Streptococcus pneumoniae polynucleotide chosen from one of SEQ ID NO: 1 
through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 591, a fragment 
thereof, a degenerate variant thereof, or an antibody specific for a Streptococcus
pneumoniae polypeptide chosen from one of SEQ ID NO: 216 through SEQ ID NO:
430 or SEQ ID NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or 
a fragment thereof. In certain embodiments, the antibody is selected from the group 
consisting of monoclonal, polyclonal, chimeric, humanized and single chain. In a 
preferred embodiment, the antibody is monoclonal. In another preferred
embodiment, the antibody is humanized.

The present invention further provides pharmaceutical compositions, in 
particular immunogenic compositions, for the prevention and/or treatment of bacterial 
infection. Thus, in one embodiment an immunogenic composition is provided 
comprising a polypeptide having an amino acid sequence chosen from one or more
of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ ID
NO: 752, a biological equivalent thereof, or a fragment thereof. In certain 
embodiments, the composition further comprises a pharmaceutically acceptable
carrier. In yet other embodiments, the immunogenic composition further comprises 
one or more adjuvants. In a preferred embodiment, the polypeptide of the
immunogenic composition is further defined as a Streptococcus pneumoniae
polypeptide having 0, 1 or 2 transmembrane domains, a Streptococcus pneumoniae 
polypeptide having 3 or more transmembrane domains, a Streptococcus pneumoniae 
polypeptide having an outer membrane domain or a periplasmic domain, a 
Streptococcus pneumoniae polypeptide having an inner membrane domain, a
Streptococcus pneumoniae polypeptide identified by Blastp analysis, a
Streptococcus pneumoniae polypeptide identified by Pfam analysis, a Streptococcus 
pneumoniae lipoprotein, a Streptococcus pneumoniae polypeptide having a LPXTG 
motif, wherein the polypeptide is covalently attached to the peptidoglycan layer, a 
Streptococcus pneumoniae polypeptide having a peptidoglycan binding motif,
wherein the polypeptide is associated with the peptidoglycan layer, a Streptococcus 
pneumoniae polypeptide having a signal sequence and a C-terminal Tyrosine or 
Phenylalanine amino acid, a Streptococcus pneumoniae polypeptide having a 
tripeptide RGD sequence, a Streptococcus pneumoniae polypeptide identified by
proteomics as surface exposed or a Streptococcus pneumoniae polypeptide 
identified by proteomics as membrane associated. In certain other embodiments, the 
immunogenic composition further comprises heterologous amino acids. In particular 
embodiments, the polypeptide is a fusion polypeptide. 

In further embodiments, provided is an immunogenic composition comprising
a polynucleotide having a nucleotide sequence chosen from one or more of SEQ ID
NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 591, a
degenerate variant thereof, or a fragment thereof, wherein the polynucleotide is comprised in an expression
vector. In preferred embodiments, the vector is plasmid DNA. In another
embodiment, the polynucleotide comprises heterologous nucleotides. In still other
embodiments, the polynucleotide is operatively linked to one or more gene 
expression regulatory elements. In yet other embodiments, the polynucleotide 
directs the expression of a neutralizing epitope of Streptococcus pneumoniae. In 
preferred embodiments, the immunogenic composition further comprises one or
more adjuvants.

Also provided is a pharmaceutical composition comprising a polypeptide and 
a pharmaceutically acceptable carrier, wherein the polypeptide comprises an amino 
acid sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO:
592 through SEQ ID NO: 752, a biological equivalent thereof, or a fragment thereof.
In preferred embodiments, the polypeptide is further defined as a Streptococcus
pneumoniae polypeptide having 0, 1 or 2 transmembrane domains, a Streptococcus 
pneumoniae polypeptide having 3 or more transmembrane domains, a Streptococcus 
pneumoniae polypeptide having an outer membrane domain or a periplasmic 
domain, a Streptococcus pneumoniae polypeptide having an inner membrane
domain, a Streptococcus pneumoniae polypeptide identified by Blastp analysis, a
Streptococcus pneumoniae polypeptide identified by Pfam analysis, a Streptococcus 
pneumoniae lipoprotein, a Streptococcus pneumoniae polypeptide having a LPXTG 
motif, wherein the polypeptide is covalently attached to the peptidoglycan layer, a 
Streptococcus pneumoniae polypeptide having a peptidoglycan binding motif,
wherein the polypeptide is associated with the peptidoglycan layer, a Streptococcus 
pneumoniae polypeptide having a signal sequence and a C-terminal Tyrosine or 
Phenylalanine amino acid, a Streptococcus pneumoniae polypeptide having a 
tripeptide RGD sequence, a Streptococcus pneumoniae polypeptide identified by
proteomics as surface exposed or a Streptococcus pneumoniae polypeptide 
identified by proteomics as membrane associated. In certain embodiments, the 
polypeptide further comprises heterologous amino acids. In still other embodiments, 
the polypeptide is a fusion polypeptide. 

In another embodiment, a method of immunizing against Streptococcus
pneumoniae is provided comprising administering to a host an immunizing amount of
an immunogenic composition comprising one or more polypeptides and a 
pharmaceutically acceptable carrier, wherein the polypeptide comprises an amino 
acid sequence chosen from one or more of SEQ ID NO: 216 through SEQ ID NO:
430 or SEQ ID NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or
a fragment thereof. In certain embodiments, the polypeptide is a fusion polypeptide. 
In other embodiments, the method further comprises administering an adjuvant. 

Other embodiments of the invention provide a DNA chip comprising an array 
of polynucleotides, wherein at least one of the polynucleotides comprises a nucleotide
sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID
NO: 431 through SEQ ID NO: 591, a complement thereof, a degenerate variant 
thereof, or a fragment thereof. 

Also provided is a protein chip comprising an array of polypeptides, wherein 
at least one of the polypeptides comprises an amino acid sequence chosen from one
of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ ID
NO: 752, a biological equivalent thereof, or a fragment thereof. 

The invention further provides methods of detecting Streptococcus 
pneumoniae polynucleotides and polypeptides as well as kits for diagnosing 
Streptococcus pneumoniae infection. 

Other embodiments provide a method for the detection and/or identification of
Streptococcus pneumoniae in a biological sample comprising contacting the sample
with an oligonucleotide probe of a polynucleotide comprising the nucleotide 
sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID 
NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, or a fragment
thereof, under conditions permitting hybridization and detecting the presence of 
hybridization complexes in the sample, wherein hybridization complexes indicate the 
presence of Streptococcus pneumoniae in the sample. 

Still other embodiments provide a method for the detection and/or
identification of Streptococcus pneumoniae in a biological sample comprising a
nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or 
SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, or a 
fragment thereof, in the presence of nucleotides and a polymerase enzyme under
conditions permitting primer extension and detecting the presence of primer
extension products in the sample, wherein extension products indicate the presence 
of Streptococcus pneumoniae in the sample. 

Further embodiments provide a method for the detection and/or identification 
of Streptococcus pneumoniae in a biological sample comprising contacting the
sample with an antibody specific for a polypeptide comprising an amino acid
sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID 
NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or a fragment 
thereof, under conditions permitting immune complex formation and detecting the 
presence of immune complexes in the sample, wherein immune complexes indicate
the presence of Streptococcus pneumoniae in the sample.

In certain embodiments, provided is a method for the detection and/or 
identification of antibodies to Streptococcus pneumoniae in a biological sample 
comprising contacting the sample with a polypeptide comprising an amino acid 
sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID
NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or a fragment
thereof, under conditions permitting immune complex formation and detecting the 
presence of immune complexes in the sample, wherein immune complexes indicate 
the presence of Streptococcus pneumoniae in the sample. 

Other embodiments of the invention provide a kit comprising a container
containing an isolated polynucleotide comprising a nucleotide sequence chosen
from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through
SEQ ID NO: 591, a degenerate variant thereof, or a fragment thereof. In a preferred
embodiment, the polynucleotide is a primer or a probe, wherein when the 
polynucleotide is a primer, the kit further comprises a container containing a
polymerase. In another embodiment, the kit further comprises a container containing 
dNTP. 

Provided further is a kit comprising a container containing an antibody that 
immunospecifically binds to a polypeptide comprising the amino acid sequence
chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 
through SEQ ID NO: 752, a biological equivalent thereof, or a fragment thereof. 

Provided also is a kit comprising a container containing an antibody that 
immunospecifically binds to a fusion polypeptide comprising at least the amino acid 
sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID
NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or a fragment 
thereof. 

In a preferred embodiment of the invention, provided is a genetically 
engineered host cell, transfected, transformed or infected with a recombinant
expression vector comprising a nucleotide sequence having at least about 95%
identity to a nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID 
NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, 
or a fragment thereof under conditions suitable to produce one of the polypeptides of 
SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO: 592 through SEQ ID NO:
752; and recovering the polypeptide.

Other features and advantages of the invention will be apparent from the 
following detailed description, from the preferred embodiments thereof, and from the 
claims. 

Detailed Description of the Invention 

The invention described hereinafter addresses the need for Streptococcus 
pneumoniae immunogenic compositions that effectively prevent or treat most or all of 
the disease caused by serotypes of Streptococcus pneumoniae. The invention 
further addresses the need for methods of diagnosing Streptococcus pneumoniae
infection. The present invention has identified novel Streptococcus pneumoniae 
open reading frames, hereinafter ORFs, which encode antigenic polypeptides. More 
particularly, the newly identified ORFs encode polypeptides that are secreted, 
exposed, membrane associated or surface localized on Streptococcus pneumoniae, 
and thus serve as potential antigenic polypeptides in immunogenic compositions.
Thus, in certain embodiments, the invention comprises Streptococcus pneumoniae 
polynucleotide ORFs encoding surface localized, exposed, secreted or membrane 
associated polypeptide antigens. The present invention therefore comprises in other 
embodiments, these polypeptides, preferably antigenic polypeptides, encoded by the
Streptococcus pneumoniae ORFs. 

In other embodiments, the invention comprises vectors comprising ORF 
sequences and host cells or animals transformed, transfected or infected with these 
vectors. The invention also comprises transcriptional gene products of
Streptococcus pneumoniae ORFs, such as, for example, mRNA, antisense RNA,
antisense oligonucleotides and ribozyme molecules, which can be used to inhibit or 
control growth of the microorganism. The invention relates also to methods of 
detecting these nucleic acids or polypeptides and kits for diagnosing Streptococcus 
pneumoniae infection. The invention also relates to pharmaceutical compositions, in
particular immunogenic compositions, for the prevention and/or treatment of bacterial
infection, in particular infection caused by or exacerbated by Streptococcus 
pneumoniae. In particular embodiments, the immunogenic compositions are used for 
the treatment or prevention of systemic diseases which are induced or exacerbated 
by Streptococcus pneumoniae. In other embodiments, the immunogenic
compositions are used for the treatment or prevention of non-systemic diseases,
particularly otitis media, which are induced or exacerbated by Streptococcus
pneumoniae. 

A. Identifying ORFs within the Genomic Sequence of Streptococcus pneumoniae

The invention is directed in particular embodiments to the identification of 
polynucleotides, more particularly ORFs, that encode Streptococcus pneumoniae
polypeptides. The availability of complete bacterial genome sequences has begun to 
play an important role in the identification of candidate antigens through genomics, 
transcriptional profiling, and proteomics, coupled with the information processing
capabilities of bioinformatics (McAtee et al., 1998a; McAtee et al., 1998b; Pizza et al.,
2000; Sonnenberg and Belisle, 1997; Weldingh et al., 1998; McAtee et al., 1998c).
Currently, no more than approximately 60% of all ORFs within a bacterial genome 
have some match with a polypeptide whose function has been determined. This
leaves approximately 40% of genomic ORFs uncharacterized. Thus, the inventors 
have analyzed the Streptococcus pneumoniae genome and utilized bioinformatic 
tools to identify novel ORFs encoding polypeptides of the present invention. In 
5 addition to genomic analysis, the inventors analyzed the Streptococcus pneumoniae 
membrane proteome component to identify novel and/or confirm ORFs encoding 
polypeptides of the present invention. As described below, the ORFs were analyzed 
for a variety of characteristics. 

Specifically, an extensive genomic analysis was performed in silico of the
Streptococcus pneumoniae type 4 genome from The Institute for Genomic Research
(TIGR) using algorithms designed to identify genes that encode novel surface
localized polypeptides or polypeptides with putative similarity to polypeptides of
known interest in other organisms. Thus, a combined analysis of the Streptococcus
pneumoniae genome, using a unique set of two ORF finder algorithms (i.e.,
GLIMMER, Salzberg et al., 1998, and the inventors' assignee's own program), produced
3,799 ORFs. The more stringent of the ORF finders, GLIMMER, produced 2,022
ORFs, while the assignee's ORF finder produced the most, with 3,798 ORFs. There
were 2,021 ORFs identified by both algorithms (2,022 + 3,798 - 3,799 = 2,021). The
difference in results between the two ORF finders is primarily due to the particular
start codons used by each program; however, GLIMMER also incorporates some
evaluation for a Shine-Dalgarno box and an interpolated Markov model. For the
purposes here, all ORFs with common stop codons are given the same ORF
designation and will be treated as if they are the same ORF. As used hereinafter,
an ORF is defined as having one of three potential start site codons, ATG, GTG or
TTG, and one of three potential stop codons, TAA, TAG or TGA. The lower limit of
amino acid length selected as a cutoff (e.g., ~74 amino acids) may also cause the
algorithms to overlook some reading frames. However, such "true" reading frames
become increasingly rare as the ORFs become shorter.

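The ORF definition above can be illustrated with a minimal Python sketch. This is
not GLIMMER and not the assignee's program; the function name `find_orfs`, the
forward-strand-only scan, and the default `min_aa=74` cutoff are assumptions made
purely for illustration. Candidate reading frames that share a stop codon collapse
into a single entry (the longest one), mirroring the stop-codon grouping described
above.

```python
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=74):
    """Return {stop_position: (start, end)} for forward-strand ORFs.

    Positions are 0-based and `end` is the index just past the stop codon.
    ORFs sharing a stop codon collapse to one entry (the longest candidate).
    Hypothetical helper: only the three forward frames are scanned.
    """
    seq = seq.upper()
    orfs = {}
    for frame in range(3):
        starts = []  # start codons seen since the last stop in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in START_CODONS:
                starts.append(i)
            elif codon in STOP_CODONS:
                if starts:
                    first = starts[0]  # most upstream start gives the longest ORF
                    n_aa = (i - first) // 3  # codons before the stop codon
                    if n_aa >= min_aa:
                        orfs[i] = (first, i + 3)
                starts = []
    return orfs
```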
The initial annotation of the Streptococcus pneumoniae ORFs was performed
using the Basic Local Alignment Search Tool (BLAST; version 2.0) gapped search
algorithm, Blastp, to identify homologous sequences (Altschul et al., 1997). A cutoff
'e' value of anything < e^-10 was considered significant. The non-redundant protein
sequence database used for the homology searches consisted of GenBank,
SWISS-PROT (Bairoch and Apweiler, 2000), PIR (Barker et al., 2001), and TREMBL
(Bairoch and Apweiler, 2000), whose database sequences are updated daily. In the
present invention, ORFs with a Blastp result of > e^-10 are considered to be unique to
Streptococcus pneumoniae. Alternate quantitative expression values other than
Blastp 'e', e.g., percent identity, may also be used to compare database sequences
with the Streptococcus pneumoniae ORFs of the present invention.

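The E-value cutoff described above amounts to a simple partition of the ORFs. The
sketch below is illustrative only: it assumes the cutoff is read as 1e-10 in the
usual BLAST notation, that a best (lowest) E-value per ORF has already been
extracted from the Blastp output, and that ORFs with no reported hit are treated
as unique. The function name and the input dictionary are hypothetical.

```python
E_VALUE_CUTOFF = 1e-10  # assumed reading of the cutoff given in the text


def classify_by_blastp(best_e_values, cutoff=E_VALUE_CUTOFF):
    """Split ORFs into (significant, unique) lists by best Blastp E-value.

    `best_e_values` maps an ORF id to its lowest E-value, or None when no
    hit was reported (treated as unique here, which is an assumption).
    """
    significant, unique = [], []
    for orf, e_value in best_e_values.items():
        if e_value is not None and e_value < cutoff:
            significant.append(orf)
        else:
            unique.append(orf)
    return significant, unique
```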
A keyword search of the entire BLAST results was carried out using known or 
suspected target genes for immunogenic compositions as well as words that 
identified the location of a protein or function. 

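A keyword search of this kind can be sketched as a case-insensitive substring
match over the hit descriptions. The text does not list the actual keywords, so the
terms used with this function (for example "surface" or "membrane") would be
hypothetical, as are the function name and the input format.

```python
def keyword_search(hit_descriptions, keywords):
    """Return ORF ids whose hit description mentions any of the keywords.

    `hit_descriptions` maps an ORF id to the description line of its BLAST
    hit; matching is case-insensitive. Illustrative helper only.
    """
    lowered = [k.lower() for k in keywords]
    matches = []
    for orf, description in hit_descriptions.items():
        text = description.lower()
        if any(k in text for k in lowered):
            matches.append(orf)
    return matches
```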
Several parameters were used to determine grouping of the predicted
Streptococcus pneumoniae polypeptides of the invention. For example, polypeptides
destined for translocation across the cytoplasmic membrane contain a leader signal
(also called signal sequence) composed of a central hydrophobic region flanked at
the N-terminus by positively charged residues (Pugsley, 1993). A software program
called SignalP, which identifies signal peptides and their cleavage sites based on
neural networks (Nielsen et al., 1997), was used in the present invention to analyze
the amino acid sequence of an ORF for such a signal peptide. The first 60
N-terminal amino acids of each ORF were analyzed by SignalP using the Gram-positive
software database. The output generated four separate values: maximum C,
maximum Y, maximum S, and mean S. The S-score, or signal region, is the
probability of the position belonging to the signal peptide. The C-score, or cleavage
site, is the probability of the position being the first in the mature protein. The
Y-score is the geometric average of the C-score and a smoothed derivative of the
S-score. A conclusion of either Yes or No is given next to each score. If all four
conclusions are Yes, then a 'YES' is listed for that ORF; if three of the conclusions
are Yes, then a 'yes' is listed for that ORF; if two of the conclusions are Yes, then a
'maybe' is listed for that ORF; for all other cases, a 'no' is listed for that ORF.

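The rule for combining the four SignalP conclusions into a single label can be
written as a short function. The function name and the use of booleans for the
Yes/No conclusions are assumptions for illustration; the four labels are those given
in the text.

```python
def signalp_call(max_c, max_y, max_s, mean_s):
    """Combine the four per-score Yes/No conclusions into one label.

    Each argument is True for a 'Yes' conclusion and False for a 'No'.
    """
    n_yes = sum(bool(x) for x in (max_c, max_y, max_s, mean_s))
    if n_yes == 4:
        return "YES"
    if n_yes == 3:
        return "yes"
    if n_yes == 2:
        return "maybe"
    return "no"
```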
To predict polypeptide localization in bacteria, the software program PSORT
was used (Nakai, 1991). PSORT predicts localization of polypeptides to the
'cytoplasm', 'periplasm', and/or 'cytoplasmic membrane' for Gram-positive bacteria,
as well as 'outer membrane' for Gram-negative bacteria. Transmembrane (TM)
domains of polypeptides were analyzed using the software program TopPred II
(Cserzo et al., 1997).

The Hidden Markov Model (HMM) Pfam database (Bateman, 2000) was used
to identify Streptococcus pneumoniae proteins that may belong to an existing protein
family. Keyword searching of this output was further used to help identify additional
candidate antigens that may have been missed by the BLAST search criteria.

A computer algorithm, called HMM Lipo, was developed by inventors'
assignee to predict lipoproteins using approximately 131 biologically proven bacterial
lipoproteins. The protein sequence from the start of the protein to the cysteine amino
acid, plus the next two additional amino acids, was used to generate the HMM (Eddy
and Markov, 1996).

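The sequence window used to train HMM Lipo (start of the protein through the
cysteine, plus two further residues) can be sketched as below. This only
illustrates the windowing step, not the HMM itself; the function name is
hypothetical, and the position of the relevant cysteine is assumed to be known for
each proven lipoprotein.

```python
def lipo_training_window(protein, cys_index):
    """Return residues from the N-terminus through the cysteine plus two more.

    `cys_index` is the 0-based position of the relevant cysteine, assumed
    to be supplied with each training sequence.
    """
    if protein[cys_index] != "C":
        raise ValueError("residue at cys_index is not a cysteine")
    return protein[:cys_index + 3]
```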
The inventors' assignee also developed an HMM using approximately 70
known prokaryotic proteins containing the LPXTG cell wall sorting signal, to predict
cell wall proteins that are anchored to the peptidoglycan layer (Mazmanian et al.,
1999; Navarre and Schneewind, 1999). The model used not only the LPXTG
sequence, but also included two features of the downstream sequence: first, the
hydrophobic transmembrane domain and, second, the positively charged carboxy
terminus. There are also a number of proteins that interact non-covalently with the
peptidoglycan layer and are distinct from the LPXTG protein class described above.
These proteins seem to have a consensus sequence at their carboxy terminus
(Koebnik, 1995). The inventors therefore developed and used an HMM of this region
to identify any Streptococcus pneumoniae proteins that may fall into this class.

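The three features named above (the LPXTG motif, a downstream hydrophobic
stretch, and a positively charged carboxy terminus) can be approximated with a
regular expression. This is a deliberately simplified stand-in for the HMM, not
the model used by the inventors; the residue classes and all length limits in the
pattern are illustrative assumptions.

```python
import re

# Simplified stand-in for the LPXTG HMM. All length limits are assumptions.
LPXTG_PATTERN = re.compile(
    r"LP.TG"                        # sorting motif; X is any residue
    r".{0,10}"                      # optional short linker
    r"[AVLIFMWGTSC]{12,25}"         # hydrophobic membrane-spanning stretch
    r"[A-Z]{0,10}[KR][A-Z]{0,10}$"  # C-terminal tail with a K or R
)


def has_lpxtg_anchor(protein):
    """Return True if the sequence ends with an LPXTG-type sorting signal."""
    return LPXTG_PATTERN.search(protein.upper()) is not None
```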
Streptococcus pneumoniae ORFs encoding surface localized, exposed, or
membrane associated polypeptides were also identified by proteomics (see, Example
3). This proteomic analysis confirmed many of the Streptococcus pneumoniae ORFs
identified by the above genomic analysis and further identified novel Streptococcus
pneumoniae ORFs encoding membrane associated polypeptides.

The following Tables (i.e., Tables 1-12) represent 12 groups into which the
ORFs identified according to the above characteristics of the present invention have
been classified. Thus, all of the groups described below are ORFs comprised within
the Streptococcus pneumoniae genome and identified as encoding putative surface
localized, exposed, membrane associated or secreted polypeptides. These groups
are not meant to limit the scope of the present invention, as analysis of additional
ORF characteristics also is contemplated. These additional characteristics, e.g.,
RGD sequence, may serve to further expand the total number of ORF groupings or
to parse the presently identified ORFs into more defined groups, broader groups,
narrower groups or group subsets. In addition, some ORFs will meet the criteria of
more than one category, and will therefore appear in more than one of the following
groups.

Listed in Table 1 are ORFs that comprise a cytoplasmic membrane signal
sequence (i.e., a SignalP value of 'YES') and have one or fewer membrane spanning
domains (MSD), as defined by the TopPred II program. Thirteen ORFs are found
that match these criteria and are considered to be surface exposed.

Table 1. ORFs encoding surface exposed polypeptides, SignalP value
= 'YES' and ≤1 MSDs.

SEQ ID   ORF
11       190
17       403
23       469
39       790
50       935
70       1143
83       1475
91       1568
97       1724
128      2271
148      2621
179      3212
209      3600

Listed in Table 2 are ORFs that comprise a cytoplasmic membrane signal
sequence (i.e., a SignalP value of 'YES') and an outer membrane (OM) or
periplasmic (Peri) prediction value when analyzed via the program Psort. Five ORFs
are found that match these criteria and are considered to be surface exposed.

Table 2. ORFs encoding surface exposed polypeptides, a SignalP value
= 'YES' and a Psort value of 'OM or Peri'.

SEQ ID   ORF
23       469
39       790
50       935
125      2228
179      3212

Listed in Table 3 are ORFs that comprise a cytoplasmic membrane signal
sequence (i.e., a SignalP value of 'YES') and have 2 or more membrane spanning
domains (MSD), as defined by the TopPred II program. Twenty-two ORFs are found
that match these criteria and are considered to be surface exposed.

Table 3. ORFs encoding surface exposed polypeptides, a SignalP value
= 'YES' and < 1 MSDs.

SEQ ID   ORF
11       190
13       339
17       403
23       469
34       640
39       790
50       935
70       1143
73       1207
83       1475
91       1568
97       1724
106      1947
121      2196
125      2228
126      2234
128      2271
148      2621
179      3212
187      3361
192      3384
209      3600



Listed in Table 4 are ORFs that comprise at least 3 of 4 SignalP values (i.e., a
SignalP value of 'yes') and have 2 or more membrane spanning domains (MSD), as
defined by the TopPred II program. Forty-nine ORFs are found that match these
criteria and are considered to be surface exposed.

Table 4. ORFs encoding surface exposed polypeptides, a SignalP value
= 'yes' and ≥2 MSDs.

SEQ ID   ORF      SEQ ID   ORF
2        72       129      2304
6        94       133      2350
10       141      140      2470
14       356      145      2594
22       462      146      2613
28       597      152      2676
29       598      156      2838
36       715      168      3072
37       716      175      3141
40       823      180      3256
46       885      184      3340
47       904      188      3369
48       916      190      3373
56       989      194      3386
59       998      203      3558
71       1178     211      3631
77       1339     213      3770
80       1412     215      3799
81       1437
86       1493
87       1528
88       1530
93       1623
99       1816
101      1849
102      1863
105      1904
112      2026
114      2061
115      2112
120      2195

A keyword search of the Blastp data for putative surface exposed proteins
produced 119 ORFs, which are listed in Table 5.

Table 5. ORFs encoding surface exposed polypeptides identified by keyword
search of Blastp data.

SEQ ID   ORF      SEQ ID   ORF      SEQ ID   ORF      SEQ ID   ORF
1        51       88       1530     158      2847     213      3770
2        72       90       1560     159      2894     214      3789
7        113      94       1630     160      2969
10       141      95       1632     161      2975
12       304      96       1710     162      2979
16       378      98       1765     163      2980
20       410      100      1835     165      3039
24       493      103      1864     166      3040
27       580      105      1904     167      3060
30       607      107      1966     169      3079
31       612      108      1999     172      3107
32       624      109      2001     173      3115
33       639      112      2026     176      3167
34       640      113      2027     177      3198
35       703      115      2112     178      3209
38       772      117      2132     180      3256
40       823      118      2191     181      3262
42       838      122      2198     182      3298
43       854      123      2201     184      3340
44       855      124      2215     185      3346
48       916      127      2239     186      3349
51       945      129      2304     188      3369
53       979      131      2329     189      3372
59       998      132      2348     191      3378
60       1013     133      2350     193      3385
61       1048     134      2352     196      3457
65       1072     135      2354     197      3473
67       1104     136      2385     198      3479
68       1117     138      2431     199      3480
69       1141     139      2452     200      3487
70       1143     141      2488     201      3493
71       1178     144      2591     202      3494
75       1244     146      2613     204      3568
76       1267     147      2615     205      3576
77       1339     151      2661     206      3578
78       1350     152      2676     207      3584
79       1410     154      2734     208      3585
80       1412     155      2814     210      3627
87       1528     157      2845     212      3669

HMM Pfam analysis helps identify ORFs encoding proteins with domains or
amino acid patterns similar to proteins that belong to an existing protein family.
A keyword search of the Pfam family classification for potential surface exposed
proteins produced 52 ORFs, which are listed in Table 6.

Table 6. ORFs encoding surface exposed polypeptides identified by HMM
Pfam analysis.

SEQ ID   ORF      SEQ ID   ORF
4        79       160      2969
18       404      162      2979
19       406      163      2980
41       828      164      2983
45       869      165      3039
55       983      166      3040
57       992      169      3079
58       996      171      3083
63       1064     174      3140
64       1070     176      3167
66       1097     180      3256
72       1179     182      3298
74       1220     183      3327
89       1559     184      3340
92       1572     186      3349
104      1868     188      3369
111      2025     189      3372
116      2129     195      3413
119      2193     198      3479
128      2271     199      3480
137      2400     205      3576
142      2499     212      3669
143      2543     213      3770
149      2642
151      2661
152      2676
153      2678
157      2845
159      2894

An algorithm called HMM Lipo was developed for use in the present
invention. The HMM Lipo program predicts lipoproteins using approximately 131
biologically proven bacterial lipoproteins. HMM Lipo identified 16 ORFs that are
putative lipoproteins, which are listed in Table 7.

Table 7. ORFs encoding surface exposed lipoproteins.

SEQ ID   ORF
3        75
8        132
9        140
13       339
21       423
26       502
34       640
62       1059
67       1104
85       1479
134      2352
147      2615
150      2655
168      3072
170      3081
173      3115



The inventors developed an HMM using approximately 70 known prokaryotic
polypeptides containing the LPXTG cell wall sorting signal. Thus, this HMM was
used to predict cell wall polypeptides that are anchored to the peptidoglycan layer.
Listed in Table 8 are 4 ORFs predicted to have the LPXTG motif, which are classified
as proteins that might be targeted by sortase.

Table 8. ORFs encoding surface exposed polypeptides anchored to the
peptidoglycan layer.

SEQ ID   ORF
13       339
21       423
34       640
170      3081

In addition, listed in Table 9 are 3 ORFs predicted by HMM PGB analysis to
encode polypeptides potentially binding to the peptidoglycan layer in a manner
independent of sortase.

Table 9. ORFs encoding surface exposed polypeptides non-covalently
anchored to the peptidoglycan layer.

SEQ ID   ORF
25       494
49       927
110      2012



ORFs that give a SignalP value of 'YES' and whose carboxy terminal amino
acid is either a phenylalanine or tyrosine are considered to be surface exposed.
Listed in Table 10 are 7 ORFs matching these criteria.

Table 10. ORFs encoding surface exposed polypeptides, a cytoplasmic
membrane signal sequence (i.e., SignalP = 'YES') and a C-terminal Phe or Tyr amino
acid.

SEQ ID   ORF
11       190
39       790
73       1207
97       1724
106      1947
125      2228
187      3361

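Several of the grouping criteria stated above are simple conjunctions of per-ORF
annotations and can be sketched as one function. The sketch covers only the
criteria as stated in the text for Tables 1, 2, 4 and 10; the function name, the
argument names, and the encoding of the Psort result as a short string are
assumptions for illustration. An ORF can satisfy more than one rule, which is why
a list of group numbers is returned.

```python
def assign_groups(signalp, n_msd, psort, c_terminal_residue):
    """Return the table numbers whose stated criteria an ORF meets.

    signalp: 'YES', 'yes', 'maybe' or 'no' (combined SignalP label)
    n_msd: number of membrane spanning domains (TopPred II)
    psort: predicted localization, e.g. 'OM' or 'Peri' (assumed encoding)
    c_terminal_residue: one-letter code of the last amino acid
    """
    groups = []
    if signalp == "YES" and n_msd <= 1:
        groups.append(1)
    if signalp == "YES" and psort in ("OM", "Peri"):
        groups.append(2)
    if signalp == "yes" and n_msd >= 2:
        groups.append(4)
    if signalp == "YES" and c_terminal_residue in ("F", "Y"):
        groups.append(10)
    return groups
```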
Twenty-eight Streptococcus pneumoniae ORFs were additionally identified by
proteomics as encoding membrane associated polypeptides and are listed in Table
11. The ORFs listed in Table 11 further support the Streptococcus pneumoniae
ORFs identified by the genomic mining algorithms described above (i.e., ORFs
encoding surface localized, secreted, or exposed polypeptides; Tables 1-10).

Table 11. Streptococcus pneumoniae ORFs confirmed by proteomics as
surface exposed.

SEQ ID   ORF
14       356
16       378
17       403
46       885
64       1070
66       1097
67       1104
69       1141
71       1178
74       1220
91       1568
103      1864
116      2129
128      2271
131      2329
136      2385
151      2661
156      2838
159      2894
162      2979
164      2983
172      3107
176      3167
178      3209
179      3212
180      3256
182      3298
205      3576

Finally, 161 novel Streptococcus pneumoniae ORFs were identified by
proteomics as encoding membrane associated polypeptides and are listed in Table
12.

Table 12. Streptococcus pneumoniae ORFs identified by proteomics as
membrane associated.

SEQ ID   ORF      SEQ ID   ORF      SEQ ID   ORF      SEQ ID   ORF
431      64       463      357      495      1344     527      2284
432      120      464      390      496      1347     528      2315
433      121      465      431      497      1356     529      2317
434      152      466      434      498      1417     530      2318
435      153      467      436      499      1465     531      2319
436      156      468      439      500      1477     532      2320
437      159      469      513      501      1515     533      2372
438      160      470      515      502      1527     534      2374
439      163      471      583      503      1565     535      2376
440      164      472      633      504      1601     536      2387
441      166      473      683      505      1606     537      2394
442      172      474      686      506      1641     538      2410
443      174      475      720      507      1770     539      2425
444      175      476      726      508      1773     540      2443
445      178      477      818      509      1774     541      2451
446      180      478      861      510      1785     542      2454
447      181      479      863      511      1803     543      2508
448      183      480      960      512      1817     544      2513
449      186      481      1004     513      1823     545      2542
450      188      482      1037     514      1847     546      2558
451      189      483      1049     515      1917     547      2568
452      192      484      1054     516      1923     548      2575
453      194      485      1061     517      1964     549      2587
454      199      486      1082     518      1970     550      2754
455      268      487      1105     519      2039     551      2800
456      269      488      1111     520      2041     552      2839
457      294      489      1175     521      2047     553      2892
458      296      490      1248     522      2058     554      2906
459      298      491      1262     523      2068     555      2958
460      301      492      1266     524      2130     556      2963
461      316      493      1312     525      2251     557      3021
462      320      494      1314     526      2282     558      3048
559      3065     569      3248     579      3552     589      3739
560      3095     570      3303     580      3555     590      3766
561      3111     571      3331     581      3560     591      3778
562      3125     572      3367     582      3564
563      3151     573      3410     583      3566
564      3153     574      3446     584      3632
565      3161     575      3454     585      3653
566      3178     576      3525     586      3714
567      3180     577      3538     587      3732
568      3234     578      3540     588      3735

As further contemplated in the present invention, Streptococcus pneumoniae
ORFs are searched and evaluated for other important characteristics. For example,
proteins that contain the Arg-Gly-Asp (RGD) attachment motif, together with integrins
that serve as their receptor, constitute a major recognition system for cell adhesion,
and thus are putative Streptococcus pneumoniae polypeptide antigens. Four
Streptococcus pneumoniae ORFs, i.e., ORF 51, ORF 423, ORF 1097 and ORF
1104, have been identified as having a tripeptide RGD sequence that potentially is
involved in cell attachment.

RGD recognition is one mechanism used by microbes to gain entry into
eukaryotic tissues (Stockbauer et al., 1999; Isberg and Nhieu, 1994). However, not
all RGD-containing proteins mediate cell attachment. It has been shown that
RGD-containing peptides with a proline at the carboxy end (RGDP) are inactive in cell
attachment assays (Pierschbacher and Ruoslahti, 1987), and these are excluded. A
tandem repeat finder (Benson, 1999) may also be used, as has been used to identify
ORFs containing repeated DNA sequences such as those found in MSCRAMMs
(Foster and Hook, 1998) and phase variable surface proteins of Neisseria
meningitidis (Parkhill et al., 2000).

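The RGD screen with the RGDP exclusion described above can be sketched with a
negative lookahead. The function name is hypothetical, and the sketch assumes
one-letter amino acid codes; it reports where qualifying RGD tripeptides occur but
says nothing about whether a given site is functional.

```python
import re

# RGD tripeptide that is not immediately followed by proline (RGDP excluded).
RGD_NOT_RGDP = re.compile(r"RGD(?!P)")


def rgd_sites(protein):
    """Return 0-based start positions of RGD motifs not followed by P."""
    return [m.start() for m in RGD_NOT_RGDP.finditer(protein.upper())]
```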
The present inventors also have used the Geanfammer software to cluster
proteins into homologous families (Park and Teichmann, 1998). Preliminary analysis
of the family classes has provided novel ORFs within a vaccine candidate cluster as
well as defined potential protein function.

The ORFs listed in Table 13 were identified by analysis of the Streptococcus
pneumoniae genome. A total of 215 ORFs were identified based on the analysis
criteria described above and listed in Tables 1-10. The 215 ORFs identified are
listed vertically in Table 13 (column 1). The nucleotide SEQ ID NOS: 1 through
215 (column 2) and the encoded polypeptide SEQ ID NOS: 216 through
430 (column 3) are listed horizontally to their respective ORF. For
example, in Table 13, ORF 51 has the nucleotide sequence of SEQ ID NO: 1 and the
encoded polypeptide has the amino acid sequence of SEQ ID NO: 216, ORF 72 has
nucleotide SEQ ID NO: 2 and encoded polypeptide SEQ ID NO: 217, etc.

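The numbering scheme pairs each nucleotide SEQ ID NO with a polypeptide SEQ ID NO
at a fixed offset: 215 for Table 13 (1 maps to 216) and, by the same pattern, 161
for the proteomics-derived ORFs of Table 14 (431 maps to 592). A small helper makes
the mapping explicit; the function name is hypothetical and the ranges are taken
from the SEQ ID ranges stated in the text.

```python
def polypeptide_seq_id(nucleotide_seq_id):
    """Map a nucleotide SEQ ID NO to its encoded polypeptide SEQ ID NO.

    SEQ ID NOS 1-215 (Table 13) map to 216-430; SEQ ID NOS 431-591
    (Table 14) map to 592-752.
    """
    n = nucleotide_seq_id
    if 1 <= n <= 215:
        return n + 215
    if 431 <= n <= 591:
        return n + 161
    raise ValueError(f"{n} is not a nucleotide SEQ ID NO in Table 13 or 14")
```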
Proteomic analysis identified twenty eight ORFs (see, Table 1 1) already listed 
in Table 13 (e.g., SEQ ID NO: 14, SEQ ID NO:16, SEQ ID NO:27, efc.) Proteomic 
analysis further identified 161 novel ORFs encoding membrane associated proteins 

15 (see, Table 12). These 161 novel ORFs identified by proteomics as membrane 
associated are listed vertically in Table 14 (column 1). The nucleotide SEQ ID NOS: 
431 through SEQ ID NO: 591 (column 2) and the encoded polypeptide SEQ ID NOS: 
592 through 752 (column 3) are listed horizontally to their respective ORF. 
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Table 13 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


51 


1 


216 


72 


2 


217 


75 


3 


218 


79 


4 


219 


86 


5 


220 


94 6 221 


113 


7 


222 


132 


8 


223 


140 


9 


224 


141 


10 


225 


190 


11 


226 


304 


12 


227 


339 


13 


228 


356 


14 


229 


370 


15 


230 


378 


16 


231 


403 


17 


232 


404 


18 


233 


406 


19 


234 


410 


20 


235 


423 


21 


236 


462 


22 


237 


469 


23 


238 


493 


24 


239 


494 


25 


240 


502 


26 


241 


580 


27 


242 


597 


28 


243 


598 


29 


244 


607 


30 


245 


612 


31 


246 


624 


32 


247 


639 33 248 


640 


34 


249 


703 


35 


250 


715 


36 


251 


716 


37 


252 


772 


38 


253 


790 


39 


254 


823 


40 


255 
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. Streptococcus Pneumoniae open reading frames (ORFs) 



Table 13. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


823 


40 


255 


828 


41 


256 


838 


42 


257 


854 


43 


258 


855 


44 


259 


869 


45 


260 


885 


46 


261 


904 


47 


262 


916 


48 


263 


927 


49 


264 


935 


50 


265 


945 


51 


266 


965 


52 


267 


979 


53 


268 


980 


54 


269 


983 


55 


270 


989 56 271 


992 


57 


272 


996 


58 


273 


998 59 274 


1013 


60 


275 


1048 


61 


276 


1059 


62 


277 


1064 


63 


278 


1070 


64 


279 


1072 


65 


280 


1097 


66 


281 


1104 


67 


282 


1117 


68 


283 


1141 


69 


284 


1143 


70 


285 


1178 


71 


286 


1179 


72 


287 


1207 


73 


288 


1220 


74 


289 


1244 


75 


290 


1267 


76 


291 


1339 


77 


292 


1350 


78 


293 


1410 


79 


294 



-42- 



Table 13. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


1412 


80 


295 


1437 


81 


296 


1459 


82 


297 


1475 


83 


298 


1476 


84 


299 


1479 


85 


300 


1493 


86 


301 


1528 


87 


302 


1530 


88 


303 


1559 


89 


304 


1560 


90 


305 


1568 


91 


306 


1572 


92 


307 


1623 


93 


308 


1630 


94 


309 


1632 


95 


310 


1710 


96 


311 


1724 


97 


312 


1765 


98 


313 


1816 


99 


314 


1835 


100 


315 


1849 


101 


316 


1863 


102 


317 


1864 


103 


318 


1868 


104 


319 


1904 


105 


320 


1947 


106 


321 


1966 


107 


322 


1999 


108 


323 


2001 


109 


324 


2012 


110 


325 


2025 


111 


326 


2026 


112 


327 


2027 


113 


328 


2061 


114 


329 


2112 


115 


330 


2129 


116 


331 


2132 


117 


332 


2191 


118 


333 


2193 


119 


334 
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Table 13. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


2195 


120 


335 


2196 


121 


336 


2198 


122 


337 


2201 


123 


338 


2215 


124 


339 


2228 


125 


340 


2234 


126 


341 


2239 


127 


342 


2271 


128 


343 


2304 


129 


344 


2322 


130 


345 


2329 


131 


346 


2348 


132 


347 


2350 


133 


348 


2352 


134 


349 


2354 


135 


350 


2385 


136 


351 


2400 


137 


352 


2431 


138 


353 


2452 


139 


354 


2470 


140 


355 


2488 


141 


356 


2499 


142 


357 


2543 


143 


358 


2591 


144 


359 


2594 


145 


360 


2613 


146 


361 


2615 


147 


362 


2621 


148 


363 


2642 


149 


364 


2655 


150 


365 


2661 


151 


366 


2676 


152 


367 


2678 


153 


368 


2734 


154 


369 


2814 


155 


370 


2838 


156 


371 


2845 


157 


372 


2847 


158 


373 


2894 


159 


374 
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Table 13. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


2969 


160 


375 


2975 


161 


376 


2979 


162 


377 


2980 


163 


378 


2983 


164 


379 


3039 


165 


380 


3040 


166 


381 


3060 


167 


382 


3072 


168 


383 


3079 


169 


384 


3081 


170 


385 


3083 


171 


386 


3107 


172 


387 


3115 


173 


388 


3140 


174 


389 


3141 


175 


390 


3167 


176 


391 


3198 


177 


392 


3209 


178 


393 


3212 


179 


394 


3256 


180 


395 


3262 


181 


396 


3298 


182 


397 


3327 


183 


398 


3340 


184 


399 


3346 


185 


400 


3349 


186 


401 


3361 


187 


402 


3369 


188 


403 


3372 


189 


404 


3373 


190 


405 


3378 


191 


406 


3384 


192 


407 


3385 


193 


408 


3386 


194 


409 


3413 


195 


410 


3457 


196 


411 


3473 


197 


412 


3479 


198 


413 


3480 


199 


414 
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Table 13. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


3487 


200 


415 


3493 


201 


416 


3494 


202 


417 


3558 


203 


418 


3568 


204 


419 


3576 


205 


420 


3578 


206 


421 


3584 


207 


422 


3585 


208 


423 


3600 


209 


424 


3627 


210 


425 


3631 


211 


426 


3669 


212 


427 


3770 


213 


428 


3789 


214 


429 


3799 


215 


430 
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Table 14. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


64 


431 


592 


120 


432 


593 


121 


433 


594 


152 


434 


595 


153 


435 


596 


156 


436 


597 


159 


437 


598 


160 438 599 


163 


439 


600 


164 


440 


601 


166 


441 


602 


172 


442 


603 


174 


443 


604 


175 


444 


605 


178 


445 


606 


180 


446 


607 


181 


447 


608 


183 


448 


609 


186 


449 


610 


188 


450 


611 


189 


451 


612 


192 


452 


613 


194 


453 


614 


199 


454 


615 


268 


455 


616 


269 


456 


617 


294 


457 


618 


296 458 619 


298 


459 


620 


301 


460 


621 


316 


461 


622 


320 


462 


623 


357 


463 


624 


390 


464 


625 


431 


465 


626 


434 


466 


627 


436 


467 


628 
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Table 14. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


— 439— 




629 


— *rT~ 


T~ 


630 


— emr — 


T~ 

70 


631 


3 


471 


632 




472 


633 


683 


473 


634 


686 


474 


635 


720 


475 


636 


726 


476 


637 


818 


477 


638 


861 


478 


639 


863 


479 


640 


960 480 641 


1004 


481 


642 


1037 


482 


643 


1049 


483 


644 


1054 


484 


645 


1061 


485 


646 


1082 


486 


647 


1105 


487 


648 


1111 


488 


649 


1175 


489 


650 


1248 


490 


651 


1262 


491 


652 


1266 


492 


653 


1312 


493 


654 


1314 


494 


655 


1344 


495 


656 


1347 


496 


657 


1356 


497 


658 


1417 


498 


659 


1465 


499 


660 


1477 


500 


661 


1515 


501 


662 


1527 


502 


663 


1565 


503 


664 


1601 


504 


665 
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Table 14. Streptococcus Pneumoniae open reading frames (ORFs) 



ORF 


Nucleotide 
SEQ ID NO 


Polypeptide 
SEQ ID NO 


1606 


505 


666 


1641 


506 


667 


1770 


507 


668 


1773 


508 


669 


1774 


509 


670 


1785 


510 


671 


1803 


511 


672 


1817 


512 


673 


1823 


513 


674 


1847 


514 


675 


1917 


515 


676 


1923 


516 


677 


1964 


517 


678 


1970 


518 


679 


2039 


519 


680 


2041 


520 


681 


2047 


521 


682 


2058 


522 


683 


2068 


523 


684 


2130 


524 


685 


2251 


525 


686 


2282 


526 


687 


2284 


527 


688 


2315 


528 


689 


2317 


529 


690 


2318 


530 


691 


2319 


531 


692 


2320 


532 


693 


2372 


533 


694 


2374 


534 


695 


2376 535 696 


2387 


536 


697 


2394 


537 


698 


2410 


538 


699 


2425 


539 


700 


2443 


540 


701 


2451 


541 


702 
2454                    542                     703
2508                    543                     704
2513                    544                     705
2542                    545                     706
2558                    546                     707
2568                    547                     708
2575                    548                     709
2587                    549                     710
2754                    550                     711
2800                    551                     712
2839                    552                     713
2892                    553                     714
2906                    554                     715
2958                    555                     716
2963                    556                     717
3021                    557                     718
3048                    558                     719
3065                    559                     720
3095                    560                     721
3111                    561                     722
3125                    562                     723
3151                    563                     724
3153                    564                     725
3161                    565                     726
3178                    566                     727
3180                    567                     728
3234                    568                     729
3248                    569                     730
3303                    570                     731
3331                    571                     732
3367                    572                     733
3410                    573                     734
3446                    574                     735
3454                    575                     736
3525                    576                     737
3538                    577                     738
3540                    578                     739
3552                    579                     740
3555                    580                     741
3560                    581                     742
3564                    582                     743
3566                    583                     744
3632                    584                     745
3653                    585                     746
3714                    586                     747
3732                    587                     748
3735                    588                     749
3739                    589                     750
3766                    590                     751
3778                    591                     752



B. Streptococcus pneumoniae ORF Polynucleotides Encoding Surface 
Exposed Polypeptides

Isolated and purified Streptococcus pneumoniae ORF polynucleotides of the 
present invention are contemplated for use in the production of Streptococcus 
pneumoniae polypeptides. More specifically, in certain embodiments, the ORFs 
encode Streptococcus pneumoniae surface localized, exposed, membrane
associated or secreted polypeptides, particularly antigenic polypeptides. Thus, in
one aspect, the present invention provides isolated and purified polynucleotides 
(ORFs) that encode Streptococcus pneumoniae surface localized, exposed, 
membrane associated or secreted polypeptides. In particular embodiments, a 
polynucleotide of the present invention is a DNA molecule, wherein the DNA may be
genomic DNA, chromosomal DNA, plasmid DNA or cDNA. In a preferred
embodiment, a polynucleotide of the present invention is a recombinant 
polynucleotide, which encodes a Streptococcus pneumoniae polypeptide comprising 
an amino acid sequence that has at least 95% identity to an amino acid sequence of 
one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ
ID NO: 752, or a fragment thereof. In another embodiment, an isolated and purified
ORF polynucleotide comprises a nucleotide sequence that has at least 95% identity
to one of the ORF nucleotide sequences of SEQ ID NO: 1 through SEQ ID NO: 215
or SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, or a
complement thereof. In a preferred embodiment, an ORF polynucleotide of one of
SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID NO:
591 is comprised in a plasmid vector and expressed in a prokaryotic host cell.

As used hereinafter, the term "polynucleotide" means a sequence of
nucleotides connected by phosphodiester linkages. Polynucleotides are presented
hereinafter in the 5' to 3' direction. A polynucleotide of the
present invention can comprise from about 10 to about several hundred thousand
base pairs. Preferably, a polynucleotide comprises from about 10 to about 3,000
base pairs. Preferred lengths of particular polynucleotides are set forth hereinafter.

A polynucleotide of the present invention can be a deoxyribonucleic acid 
(DNA) molecule, a ribonucleic acid (RNA) molecule, or analogs of the DNA or RNA 
generated using nucleotide analogs. The nucleic acid molecule can be
single-stranded or double-stranded, but preferably is double-stranded DNA. Where a
polynucleotide is a DNA molecule, that molecule can be a gene, a cDNA molecule or 
a genomic DNA molecule. Nucleotide bases are indicated hereinafter by a single 
letter code: adenine (A), guanine (G), thymine (T), cytosine (C), inosine (I) and uracil 
(U). 

"Isolated" means altered "by the hand of man" from the natural state. If an
"isolated" composition or substance occurs in nature, it has been changed or
removed from its original environment, or both. For example, a polynucleotide or a
polypeptide naturally present in a living animal is not "isolated," but the same
polynucleotide or polypeptide separated from the coexisting materials of its natural
state is "isolated," as the term is employed hereinafter.

Preferably, an "isolated" polynucleotide is free of sequences which naturally 
flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic 
acid) in the genomic DNA of the organism from which the nucleic acid is derived. For 
example, in various embodiments, the isolated Streptococcus pneumoniae nucleic
acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1
kb of nucleotide sequences which naturally flank the nucleic acid molecule in
genomic DNA of the cell from which the nucleic acid is derived. However, the
Streptococcus pneumoniae nucleic acid molecule can be fused to other protein
encoding or regulatory sequences and still be considered isolated. 

ORF polynucleotides of the present invention may be obtained, using 
standard cloning and screening techniques, from a cDNA library derived from mRNA. 
Polynucleotides of the invention can also be obtained from natural sources such as
genomic DNA libraries (e.g., a Streptococcus pneumoniae library) or can be
synthesized using well known and commercially available techniques. As contemplated
in the present invention, ORF polynucleotides will be obtained using Streptococcus
pneumoniae type 3, type 14 or type 19F chromosomal DNA as the template.

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequences shown in SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID
NO: 431 through SEQ ID NO: 591 (and fragments thereof) due to degeneracy of the
genetic code and thus encode the same Streptococcus pneumoniae polypeptide as
that encoded by the nucleotide sequence shown in SEQ ID NO:1 through SEQ ID
NO:215 or SEQ ID NO: 431 through SEQ ID NO: 591.
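
By way of illustration only, the following sketch shows what a degenerate variant is: two different nucleotide sequences that translate to the same polypeptide. The codon table is a small subset of the standard genetic code, and the sequences and function names are hypothetical; none of them is taken from the sequence listing.

```python
# Subset of the standard genetic code, sufficient for this illustration only.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L", "TTA": "L", "TTG": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence (5' to 3') up to the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    peptide = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError for codons outside the subset
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True when the sequences differ but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical example sequences (not from the sequence listing).
reference = "ATGGCTCTTAAATAA"
variant = "ATGGCGTTGAAGTGA"
print(translate(reference), translate(variant), is_degenerate_variant(reference, variant))
```

The two sequences differ at the third position of several codons yet specify the same peptide, which is the sense in which the variants above "encode the same" polypeptide.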

Orthologues and allelic variants of the Streptococcus pneumoniae 
polynucleotides can readily be identified using methods well known in the art. Allelic 
variants and orthologues of the polynucleotides will comprise a nucleotide sequence 
that is typically at least about 70-75%, more typically at least about 80-85%, and
most typically at least about 90-95% or more homologous to the nucleotide sequence
shown in SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO: 431 through SEQ ID
NO: 591, or a fragment of these nucleotide sequences. Such nucleic acid molecules
can readily be identified as being able to hybridize, preferably under stringent
conditions, to the nucleotide sequence shown in SEQ ID NO:1 through SEQ ID
NO:215 or SEQ ID NO: 431 through SEQ ID NO: 591, or a fragment of these
nucleotide sequences. 

Moreover, the polynucleotide of the invention can comprise only a fragment of 
the coding region of a Streptococcus pneumoniae polynucleotide or gene, such as a 
fragment of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO: 431
through SEQ ID NO: 591. Preferably, such fragments are immunogenic fragments.

When the ORF polynucleotides of the invention are used for the recombinant 
production of Streptococcus pneumoniae polypeptides of the present invention, the 
polynucleotide may include the coding sequence for the mature polypeptide, by itself,
or the coding sequence for the mature polypeptide in reading frame with other coding
sequences, such as those encoding a leader or secretory sequence, a pre-, or pro-
or prepro- protein sequence, or other fusion peptide portions. For example, a marker
sequence which facilitates purification of the fused polypeptide can be linked to the
coding sequence (see Gentz et al., 1989, incorporated by reference hereinafter in its
entirety). Thus, contemplated in the present invention is the preparation of
polynucleotides encoding fusion polypeptides permitting His-tag purification of
expression products. The polynucleotide may also contain non-coding 5' and 3'
sequences, such as transcribed, non-translated sequences, splicing and
polyadenylation signals.

Thus, a polynucleotide encoding a polypeptide of the present invention, 
including homologs and orthologs from species other than Streptococcus 
pneumoniae, may be obtained by a process which comprises the steps of screening 
an appropriate library under stringent hybridization conditions with a labeled probe
having the sequence of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO:
431 through SEQ ID NO: 591, or a fragment thereof; and isolating full-length cDNA and
genomic clones containing the polynucleotide sequence. Such hybridization
techniques are well known to the skilled artisan. The skilled artisan will appreciate
that, in many cases, an isolated cDNA sequence will be incomplete, in that the region
coding for the polypeptide is cut short at the 5' end of the cDNA. This is a
consequence of reverse transcriptase, an enzyme with inherently low "processivity" 
(a measure of the ability of the enzyme to remain attached to the template during the 
polymerization reaction), failing to complete a DNA copy of the mRNA template 
during 1st strand cDNA synthesis. 

Thus, in certain embodiments, the polynucleotide sequence information
provided by the present invention allows for the preparation of relatively short DNA
(or RNA) oligonucleotide sequences having the ability to specifically hybridize to
gene sequences of the selected polynucleotides disclosed hereinafter. The term
"oligonucleotide" as used hereinafter is defined as a molecule comprised of two or
more deoxyribonucleotides or ribonucleotides, usually more than three (3), and
typically more than ten (10) and up to one hundred (100) or more (although
preferably between twenty and thirty). The exact size will depend on many factors,
which in turn depend on the ultimate function or use of the oligonucleotide. Thus, in
particular embodiments of the invention, nucleic acid probes of an appropriate length
are prepared based on a consideration of a selected nucleotide sequence, e.g., a
sequence such as that shown in SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID
NO: 431 through SEQ ID NO: 591. The ability of such nucleic acid probes to
specifically hybridize to a polynucleotide encoding a Streptococcus pneumoniae
polypeptide lends them particular utility in a variety of embodiments. Most
importantly, the probes can be used in a variety of assays for detecting the presence
of complementary sequences in a given sample.

In certain embodiments, it is advantageous to use oligonucleotide primers.
These primers may be generated in any manner, including chemical synthesis, DNA
replication, reverse transcription, or a combination thereof. The sequence of such
primers is designed using a polynucleotide of the present invention for use in
detecting, amplifying or mutating a defined segment of an ORF polynucleotide that
encodes a Streptococcus pneumoniae polypeptide from prokaryotic cells using
polymerase chain reaction (PCR) technology.

In certain embodiments, it is advantageous to employ a polynucleotide of the 
present invention in combination with an appropriate label for detecting hybrid 
formation. A wide variety of appropriate labels are known in the art, including 
radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of
giving a detectable signal.

Polynucleotides which are identical or sufficiently identical to a nucleotide
sequence contained in one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO:
431 through SEQ ID NO: 591, or a fragment thereof, may be used as hybridization
probes for cDNA and genomic DNA or as primers for a nucleic acid amplification
(PCR) reaction, to isolate full-length cDNAs and genomic clones encoding
polypeptides of the present invention and to isolate cDNA and genomic clones of
other genes (including genes encoding homologs and orthologs from species other
than Streptococcus pneumoniae) that have a high sequence similarity to the
polynucleotide sequences set forth in one of SEQ ID NO:1 through SEQ ID NO:215 or
SEQ ID NO: 431 through SEQ ID NO: 591, or a fragment thereof. Typically these
nucleotide sequences are from at least about 70% identical to at least about 95%
identical to that of the reference polynucleotide sequence. The probes or primers will
generally comprise at least 15 nucleotides, preferably, at least 30 nucleotides and
may have at least 50 nucleotides. Particularly preferred probes will have between 30
and 50 nucleotides.
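
By way of illustration only, percent identity between a candidate sequence and a reference sequence might be computed as sketched below. The sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps; this passage does not prescribe an alignment algorithm or an identity definition, so the function name, the gap handling and the example sequences are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (one common convention, assumed here).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = aligned_a.upper(), aligned_b.upper()
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Hypothetical aligned sequences (not from the sequence listing).
print(percent_identity("ATGCATGCAT", "ATGCATGGAT"))  # one mismatch in ten columns
print(percent_identity("ATG-CA", "ATGGCA"))          # one gap in six columns
```

A result at or above a chosen cutoff (for example 70% or 95%) would then be compared against the thresholds recited in the text.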

There are several methods available and well known to those skilled in the art 
to obtain full-length cDNAs, or extend short cDNAs, for example those based on the 
method of Rapid Amplification of cDNA ends (RACE) (see, Frohman et al., 1988).
Recent modifications of the technique, exemplified by the Marathon™ technology 
(Clontech Laboratories Inc.) for example, have significantly simplified the search for 
longer cDNAs. In the Marathon™ technology, cDNAs have been prepared from 
mRNA extracted from a chosen tissue and an "adaptor" sequence ligated onto each
end. Nucleic acid amplification (PCR) is then carried out to amplify the "missing" 5'
end of the cDNA using a combination of gene specific and adaptor specific 
oligonucleotide primers. The PCR reaction is then repeated using "nested" primers, 
that is, primers designed to anneal within the amplified product (typically an adaptor 
specific primer that anneals further 3' in the adaptor sequence and a gene specific
primer that anneals further 5' in the known gene sequence). The products of this
reaction can then be analyzed by DNA sequencing and a full-length cDNA 
constructed either by joining the product directly to the existing cDNA to give a 
complete sequence, or carrying out a separate full-length PCR using the new 
sequence information for the design of the 5' primer. 

To provide certain of the advantages in accordance with the present
invention, a preferred nucleic acid sequence employed for hybridization studies or
assays includes probe molecules that are complementary to a stretch of at least 10 to
about 70 nucleotides of a polynucleotide that encodes a Streptococcus
pneumoniae polypeptide, such as that shown in one of SEQ ID NO:216 through SEQ
ID NO:430 or SEQ ID NO: 592 through SEQ ID NO: 752. A size of at least 10
nucleotides in length helps to ensure that the fragment will be of sufficient length to
form a duplex molecule that is both stable and selective. Molecules having
complementary sequences over stretches greater than 10 bases in length are
generally preferred, though, in order to increase stability and selectivity of the hybrid,
and thereby improve the quality and degree of specific hybrid molecules obtained.
One will generally prefer to design nucleic acid molecules having
gene-complementary stretches of 25 to 40 nucleotides, 55 to 70 nucleotides, or even
longer where desired. Such fragments can be readily prepared by, for example,
directly synthesizing the fragment by chemical means, by application of nucleic acid
reproduction technology, such as the PCR technology of U.S. Patent 4,683,202
(incorporated hereinafter by reference), or by excising selected DNA fragments from
recombinant plasmids containing appropriate inserts and suitable restriction enzyme
sites.

In another aspect, the present invention contemplates an isolated and purified 
polynucleotide comprising a nucleotide sequence that is identical or complementary 
to a segment of at least 10 contiguous bases of one of SEQ ID NO:1 through SEQ ID 
NO:215 or SEQ ID NO: 431 through SEQ ID NO: 591, wherein the polynucleotide
hybridizes to a polynucleotide that encodes a Streptococcus pneumoniae
polypeptide. Preferably, the isolated and purified polynucleotide comprises a base 
sequence that is identical or complementary to a segment of at least 25 to about 70 
contiguous bases of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO: 
431 through SEQ ID NO: 591. For example, the polynucleotide of the invention can
comprise a segment of bases identical or complementary to 40 or 55 contiguous
bases of the disclosed nucleotide sequences. 
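
By way of illustration only, the following sketch tests whether a candidate polynucleotide contains a stretch that is identical or complementary to at least a given number of contiguous bases of a reference sequence. The function names and the example sequences are hypothetical, and "complementary" is taken here to mean that the reverse complement of the candidate occurs in the reference as written 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_shared_run(probe: str, reference: str) -> int:
    """Length of the longest contiguous stretch of probe found in reference."""
    probe, reference = probe.upper(), reference.upper()
    best = 0
    for start in range(len(probe)):
        # Only stretches longer than the current best are worth testing.
        length = best + 1
        while start + length <= len(probe) and probe[start:start + length] in reference:
            best = length
            length += 1
    return best


def matches_segment(probe: str, reference: str, min_len: int = 10) -> bool:
    """True if probe is identical or complementary to min_len contiguous bases."""
    identical = longest_shared_run(probe, reference)
    complementary = longest_shared_run(reverse_complement(probe), reference)
    return max(identical, complementary) >= min_len


# Hypothetical reference and probes (not from the sequence listing).
reference = "ATGACCGTTAAGCTGGCATCC"
print(matches_segment("CCGTTAAGCTGG", reference))  # identical stretch
print(matches_segment("CCAGCTTAACGG", reference))  # complementary stretch
print(matches_segment("TTTTTTTTTTTT", reference))  # no qualifying stretch
```

Raising `min_len` to 25, 40, 55 or 70 corresponds to the longer preferred segments mentioned in the text.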

Accordingly, a polynucleotide probe molecule of the invention can be used for 
its ability to selectively form duplex molecules with complementary stretches of the 
gene. Depending on the application envisioned, one will desire to employ varying
conditions of hybridization to achieve varying degrees of selectivity of the probe
toward the target sequence (see Table 15 below). For applications requiring a high 
degree of selectivity, one will typically desire to employ relatively stringent conditions 
to form the hybrids. Of course, for some applications, for example, where one 
desires to prepare mutants employing a mutant primer strand hybridized to an
underlying template or where one seeks to isolate a Streptococcus pneumoniae
homologous polypeptide coding sequence from other cells, functional equivalents, or 
the like, less stringent hybridization conditions are typically needed to allow formation 
of the heteroduplex (see Table 15). Cross-hybridizing species can thereby be readily 
identified as positively hybridizing signals with respect to control hybridizations.
Thus, hybridization conditions are readily manipulated, and thus will generally be a
method of choice depending on the desired results. 

Of course, for some applications, for example, where one desires to prepare 
mutants employing a mutant primer strand hybridized to an underlying template or
where one seeks to isolate a homologous polypeptide coding sequence from other
cells, functional equivalents, or the like, less stringent hybridization conditions are
typically needed to allow formation of the heteroduplex. Cross-hybridizing species
are thereby readily identified as positively hybridizing signals with respect to control
hybridizations. In any case, it is generally appreciated that conditions can be
rendered more stringent by the addition of increasing amounts of formamide, which 
serves to destabilize the hybrid duplex in the same manner as increased 
temperature. Thus, hybridization conditions are readily manipulated, and thus will 
generally be a method of choice depending on the desired results. 

The present invention also includes polynucleotides capable of hybridizing
under reduced stringency conditions, more preferably stringent conditions, and most
preferably highly stringent conditions, to polynucleotides described hereinafter.
Examples of stringency conditions are shown in the table below: highly stringent
conditions are those that are at least as stringent as, for example, conditions A-F;
stringent conditions are at least as stringent as, for example, conditions G-L; and
reduced stringency conditions are at least as stringent as, for example, conditions
M-R.

Table 15
Stringency Conditions

Stringency  Polynucleotide  Hybrid Length  Hybridization Temperature                     Wash Temperature
Condition   Hybrid          (bp)^1         and Buffer^H                                  and Buffer^H
A           DNA:DNA         >50            65°C; 1xSSC -or- 42°C; 1xSSC, 50% formamide   65°C; 0.3xSSC
B           DNA:DNA         <50            T_B; 1xSSC                                    T_B; 1xSSC
C           DNA:RNA         >50            67°C; 1xSSC -or- 45°C; 1xSSC, 50% formamide   67°C; 0.3xSSC
D           DNA:RNA         <50            T_D; 1xSSC                                    T_D; 1xSSC
E           RNA:RNA         >50            70°C; 1xSSC -or- 50°C; 1xSSC, 50% formamide   70°C; 0.3xSSC
F           RNA:RNA         <50            T_F; 1xSSC                                    T_F; 1xSSC
G           DNA:DNA         >50            65°C; 4xSSC -or- 42°C; 4xSSC, 50% formamide   65°C; 1xSSC
H           DNA:DNA         <50            T_H; 4xSSC                                    T_H; 4xSSC
I           DNA:RNA         >50            67°C; 4xSSC -or- 45°C; 4xSSC, 50% formamide   67°C; 1xSSC
J           DNA:RNA         <50            T_J; 4xSSC                                    T_J; 4xSSC
K           RNA:RNA         >50            70°C; 4xSSC -or- 50°C; 4xSSC, 50% formamide   67°C; 1xSSC
L           RNA:RNA         <50            T_L; 2xSSC                                    T_L; 2xSSC
M           DNA:DNA         >50            50°C; 4xSSC -or- 40°C; 6xSSC, 50% formamide   50°C; 2xSSC
N           DNA:DNA         <50            T_N; 6xSSC                                    T_N; 6xSSC
O           DNA:RNA         >50            55°C; 4xSSC -or- 42°C; 6xSSC, 50% formamide   55°C; 2xSSC
P           DNA:RNA         <50            T_P; 6xSSC                                    T_P; 6xSSC
Q           RNA:RNA         >50            60°C; 4xSSC -or- 45°C; 6xSSC, 50% formamide   60°C; 2xSSC
R           RNA:RNA         <50            T_R; 4xSSC                                    T_R; 4xSSC



(bp)^1: The hybrid length is that anticipated for the hybridized region(s) of the
hybridizing polynucleotides. When hybridizing a polynucleotide to a target
polynucleotide of unknown sequence, the hybrid length is assumed to be that of the
hybridizing polynucleotide. When polynucleotides of known sequence are
hybridized, the hybrid length can be determined by aligning the sequences of the
polynucleotides and identifying the region or regions of optimal sequence
complementarity.

Buffer^H: SSPE (1xSSPE is 0.15 M NaCl, 10 mM NaH2PO4, and 1.25 mM EDTA,
pH 7.4) can be substituted for SSC (1xSSC is 0.15 M NaCl and 15 mM sodium citrate)
in the hybridization and wash buffers; washes are performed for 15 minutes after
hybridization is complete.

T_B through T_R: The hybridization temperature for hybrids anticipated to be
less than 50 base pairs in length should be 5-10°C less than the melting temperature
(Tm) of the hybrid, where Tm is determined according to the following equations. For
hybrids less than 18 base pairs in length, Tm(°C) = 2(# of A + T bases) + 4(# of G + C
bases). For hybrids between 18 and 49 base pairs in length, Tm(°C) = 81.5 +
16.6(log10[Na+]) + 0.41(%G+C) - (600/N), where N is the number of bases in the
hybrid, and [Na+] is the concentration of sodium ions in the hybridization buffer ([Na+]
for 1xSSC = 0.165 M).
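
By way of illustration only, the two melting temperature equations given above for hybrids shorter than 50 base pairs can be evaluated as sketched below. The function names are hypothetical, the default sodium concentration is the 1xSSC value stated above, and counting U together with A and T for RNA hybrids is an assumption not stated in the footnote.

```python
import math


def melting_temperature(seq: str, sodium_molar: float = 0.165) -> float:
    """Tm in degrees C for a hybrid shorter than 50 bases, per the equations above."""
    seq = seq.upper()
    n = len(seq)
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")  # U counted with A/T (assumption)
    if n == 0 or gc + at != n:
        raise ValueError("sequence must be non-empty and contain only A, C, G, T or U")
    if n < 18:
        # Tm = 2(# of A + T bases) + 4(# of G + C bases)
        return 2.0 * at + 4.0 * gc
    if n <= 49:
        # Tm = 81.5 + 16.6(log10[Na+]) + 0.41(%G+C) - (600/N)
        percent_gc = 100.0 * gc / n
        return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * percent_gc - 600.0 / n
    raise ValueError("for hybrids of 50 bases or more, the table gives fixed temperatures")


def hybridization_range(seq: str, sodium_molar: float = 0.165) -> tuple:
    """Hybridization temperature window, 5 to 10 degrees C below Tm."""
    tm = melting_temperature(seq, sodium_molar)
    return (tm - 10.0, tm - 5.0)


# Hypothetical probes (not from the sequence listing).
print(melting_temperature("ATGCATGCATGC"))   # 12 bases: short-hybrid equation
print(melting_temperature("ATGC" * 5))       # 20 bases: salt-adjusted equation
print(hybridization_range("ATGCATGCATGC"))
```

The returned window corresponds to the temperatures T_B through T_R in Table 15 for the buffer concentration supplied.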



Additional examples of stringency conditions for polynucleotide hybridization
are provided in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, chapters 9 and 11, 
and Ausubel et al., 1995, Current Protocols in Molecular Biology, eds., John Wiley & 
Sons, Inc., sections 2.10 and 6.3-6.4, incorporated hereinafter by reference. 

In addition to the nucleic acid molecules encoding Streptococcus pneumoniae
polypeptides described above, another aspect of the invention pertains to isolated
nucleic acid molecules which are antisense thereto. An "antisense" nucleic acid
comprises a nucleotide sequence which is complementary to a "sense" nucleic acid
encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. Accordingly, an antisense
nucleic acid can hydrogen bond to a sense nucleic acid. The antisense nucleic acid
can be complementary to an entire Streptococcus pneumoniae coding strand, or to
only a fragment thereof. In one embodiment, an antisense nucleic acid molecule is
antisense to a "coding region" of the coding strand of a nucleotide sequence
encoding a Streptococcus pneumoniae polypeptide.

The term "coding region" refers to the region of the nucleotide sequence
comprising codons which are translated into amino acid residues, e.g., the entire
coding region of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO: 431
through SEQ ID NO: 591. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide
sequence encoding a Streptococcus pneumoniae polypeptide. The term "noncoding
region" refers to 5' and 3' sequences that flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequence encoding the Streptococcus pneumoniae 
polypeptide disclosed hereinafter (e.g., one of SEQ ID NO:1 through SEQ ID NO:215 
or SEQ ID NO: 431 through SEQ ID NO: 591), antisense nucleic acids of the
invention can be designed according to the rules of Watson and Crick base pairing.
The antisense nucleic acid molecule can be complementary to the entire coding
region of Streptococcus pneumoniae mRNA, but more preferably is an 
oligonucleotide which is antisense to only a fragment of the coding or noncoding 
region of Streptococcus pneumoniae mRNA. For example, the antisense 
oligonucleotide can be complementary to the region surrounding the translation start
site of Streptococcus pneumoniae mRNA.
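
By way of illustration only, an antisense oligonucleotide designed by Watson and Crick base pairing is the reverse complement of the targeted stretch of the coding strand. The sketch below derives a DNA antisense oligonucleotide for a window beginning at the start codon; the function name, the window length and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(coding_strand: str, start: int, length: int) -> str:
    """DNA antisense oligonucleotide (5' to 3') against coding_strand[start:start + length]."""
    if start < 0 or length <= 0:
        raise ValueError("start must be non-negative and length positive")
    target = coding_strand.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("requested window extends past the end of the sequence")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical coding strand beginning at the ATG start codon.
coding = "ATGAAAGCTCTGACC"
print(antisense_oligo(coding, 0, 12))
```

Applying the same function to the resulting oligonucleotide returns the original target window, which is a convenient check that the two strands are fully complementary.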

An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 
35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can 
be constructed using chemical synthesis and enzymatic ligation reactions using 
procedures known in the art. For example, an antisense nucleic acid (e.g., an
antisense oligonucleotide) can be chemically synthesized using naturally occurring
nucleotides or variously modified nucleotides designed to increase the biological
stability of the molecules or to increase the physical stability of the duplex formed
between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives
and acridine substituted nucleotides can be used. Examples of modified nucleotides
which can be used to generate the antisense nucleic acid include 5-fluorouracil,
5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid
(v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine.

Alternatively, the antisense nucleic acid can be produced biologically using an
expression vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an
antisense orientation to a target nucleic acid of interest, described further in the
following subsection).

The antisense nucleic acid molecules of the invention are typically
administered to a subject or generated in situ such that they hybridize with or bind to
cellular mRNA and/or genomic DNA encoding a Streptococcus pneumoniae 
polypeptide to thereby inhibit expression of the polypeptide, e.g., by inhibiting 
transcription and/or translation. The hybridization can be by conventional nucleotide
complementarity to form a stable duplex, or, for example, in the case of an antisense
nucleic acid molecule which binds to DNA duplexes, through specific interactions in 
the major groove of the double helix. An example of a route of administration of an 
antisense nucleic acid molecule of the invention includes direct injection at a tissue 
site. Alternatively, an antisense nucleic acid molecule can be modified to target
selected cells and then administered systemically. For example, for systemic
administration, an antisense molecule can be modified such that it specifically binds 
to a receptor or an antigen expressed on a selected cell surface, e.g., by linking the 
antisense nucleic acid molecule to a peptide or an antibody which binds to a cell
surface receptor or antigen. The antisense nucleic acid molecule can also be
delivered to cells using the vectors described hereinafter. 

In yet another embodiment, the antisense nucleic acid molecule of the
invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid
molecule forms specific double-stranded hybrids with complementary RNA in which,
contrary to the usual β-units, the strands run parallel to each other (Gaultier et al.,
1987). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al., 1987(a)) or a chimeric RNA-DNA analogue
(Inoue et al., 1987(b)).

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which
are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which
they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes
(described in Haselhoff and Gerlach, 1988)) can be used to catalytically cleave
Streptococcus pneumoniae mRNA transcripts to thereby inhibit translation of
Streptococcus pneumoniae mRNA. A ribozyme having specificity for a
Streptococcus pneumoniae-encoding nucleic acid can be designed based upon the
nucleotide sequence of a Streptococcus pneumoniae cDNA disclosed hereinafter
(i.e., SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO: 431 through SEQ ID NO:
591). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed
in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a Streptococcus pneumoniae-encoding mRNA.
See, e.g., Cech et al. U.S. Patent 4,987,071 and Cech et al. U.S. Patent 5,116,742,
both incorporated by reference. Alternatively, Streptococcus pneumoniae mRNA can
be used to select a catalytic RNA having a specific ribonuclease activity from a pool
of RNA molecules. See, e.g., Bartel and Szostak, 1993.

Alternatively, Streptococcus pneumoniae gene expression can be inhibited by
targeting nucleotide sequences complementary to the regulatory region of the
Streptococcus pneumoniae gene (e.g., the Streptococcus pneumoniae gene
promoter and/or enhancers) to form triple helical structures that prevent transcription
of the Streptococcus pneumoniae gene in target cells. See generally, Helene, 1991;
Helene et al., 1992; and Maher, 1992.

Streptococcus pneumoniae gene expression can also be inhibited using RNA
interference (RNAi). This is a technique for post-transcriptional gene silencing
(PTGS), in which target gene activity is specifically abolished with cognate
double-stranded RNA (dsRNA). RNAi resembles in many aspects PTGS in plants and has
been detected in many invertebrates including trypanosome, hydra, planaria,
nematode and fruit fly (Drosophila melanogaster). It may be involved in the
modulation of transposable element mobilization and antiviral state formation. RNAi
in mammalian systems is disclosed in International Application WO 00/63364 which
is incorporated by reference hereinafter in its entirety. Basically, dsRNA of at least
about 600 nucleotides, homologous to the target is introduced into the cell and a
sequence specific reduction in gene activity is observed.

C. Streptococcus pneumoniae Polypeptides 

In particular embodiments, the present invention provides isolated and
purified Streptococcus pneumoniae polypeptides. Preferably, a Streptococcus
pneumoniae polypeptide of the invention is a recombinant polypeptide. In certain
embodiments, a Streptococcus pneumoniae polypeptide of the present invention
comprises an amino acid sequence that has at least 95% identity to the amino acid
sequence of one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592
through SEQ ID NO:752, a biological equivalent thereof, or a fragment thereof.

A Streptococcus pneumoniae polypeptide according to the present invention
encompasses a polypeptide that comprises: 1) the amino acid sequence shown in
one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID
NO:752; 2) functional and non-functional naturally occurring variants or biological
equivalents of Streptococcus pneumoniae polypeptides of SEQ ID NO:216 through
SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752; 3) recombinantly produced
variants or biological equivalents of Streptococcus pneumoniae polypeptides of SEQ ID
NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752; and
4) polypeptides isolated from organisms other than Streptococcus pneumoniae
(orthologues of Streptococcus pneumoniae polypeptides).

A biological equivalent or variant of a Streptococcus pneumoniae polypeptide
according to the present invention encompasses 1) a polypeptide isolated from
Streptococcus pneumoniae; and 2) a polypeptide that has substantial
homology to a Streptococcus pneumoniae polypeptide.

Biological equivalents or variants of Streptococcus pneumoniae polypeptides include
both functional and non-functional Streptococcus pneumoniae polypeptides. Functional
biological equivalents or variants are naturally occurring amino acid sequence
variants of a Streptococcus pneumoniae polypeptide that maintain the ability to elicit
an immunological or antigenic response in a subject. Functional variants will typically
contain only conservative substitution of one or more amino acids of one of SEQ ID
NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752, or
substitution, deletion or insertion of non-critical residues in non-critical regions of the
polypeptide (e.g., not in regions containing antigenic determinants or protective
epitopes).

The present invention further provides non-Streptococcus pneumoniae
orthologues of Streptococcus pneumoniae polypeptides. Orthologues of
Streptococcus pneumoniae polypeptides are polypeptides that are isolated from
non-Streptococcus pneumoniae organisms and possess antigenic capabilities of the
Streptococcus pneumoniae polypeptide. Orthologues of a Streptococcus
pneumoniae polypeptide can readily be identified as comprising an amino acid
sequence that is substantially homologous to one of SEQ ID NO:216 through SEQ ID
NO:430 or SEQ ID NO:592 through SEQ ID NO:752.

Modifications and changes can be made in the structure of a polypeptide of
the present invention while still obtaining a molecule having Streptococcus pneumoniae
antigenicity. For example, certain amino acids can be substituted for other amino
acids in a sequence without appreciable loss of antigenicity. Because it is the
interactive capacity and nature of a polypeptide that defines that polypeptide's
biological functional activity, certain amino acid sequence substitutions can be made
in a polypeptide sequence (or, of course, its underlying DNA coding sequence) while
nevertheless obtaining a polypeptide with like properties.

In making such changes, the hydropathic index of amino acids can be
considered. The importance of the hydropathic amino acid index in conferring
interactive biologic function on a polypeptide is generally understood in the art (Kyte
& Doolittle, 1982). It is known that certain amino acids can be substituted for other
amino acids having a similar hydropathic index or score and still result in a
polypeptide with similar biological activity. Each amino acid has been assigned a
hydropathic index on the basis of its hydrophobicity and charge characteristics.
Those indices are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine
(+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4);
threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline (-1.6);
histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine (-3.5);
lysine (-3.9); and arginine (-4.5).

It is believed that the relative hydropathic character of the amino acid residue
determines the secondary and tertiary structure of the resultant polypeptide, which in
turn defines the interaction of the polypeptide with other molecules, such as
enzymes, substrates, receptors, antibodies, antigens, and the like. It is known in the
art that an amino acid can be substituted by another amino acid having a similar
hydropathic index and still yield a functionally equivalent polypeptide. In such
changes, the substitution of amino acids whose hydropathic indices are within ±2 is
preferred, those that are within ±1 are particularly preferred, and those within ±0.5
are even more particularly preferred.
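
The ±2, ±1 and ±0.5 windows described above can be sketched as a simple lookup. The following is a minimal illustration only: the index values are those listed in the text, while the function name `hydropathy_tier` and the three-letter residue keys are assumptions introduced here, not part of the disclosure.

```python
# Hydropathic indices as listed in the text (Kyte & Doolittle, 1982).
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def hydropathy_tier(original, substitute):
    """Classify a substitution by the difference in hydropathic index.

    Returns the narrowest preference tier named in the text, or None
    when the two indices differ by more than 2.
    """
    # Round to one decimal so that boundary values (e.g. exactly 0.5)
    # are not pushed over the threshold by floating-point error.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[substitute]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None


if __name__ == "__main__":
    for pair in [("Ile", "Val"), ("Ile", "Leu"), ("Leu", "Cys"), ("Ala", "Arg")]:
        print(pair, hydropathy_tier(*pair))
```

The sketch considers the hydropathic index alone; it does not account for the position of the residue or for any antigenic determinant it may form part of.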

Substitution of like amino acids can also be made on the basis of
hydrophilicity, particularly where the biological functional equivalent polypeptide or
peptide thereby created is intended for use in immunological embodiments. U.S.
Patent 4,554,101, incorporated hereinafter by reference, states that the greatest local
average hydrophilicity of a polypeptide, as governed by the hydrophilicity of its
adjacent amino acids, correlates with its immunogenicity and antigenicity, i.e. with a
biological property of the polypeptide.

As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have
been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0
±1); glutamate (+3.0 ±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine
(0); proline (-0.5 ±1); threonine (-0.4); alanine (-0.5); histidine (-0.5); cysteine (-1.0);
methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3);
phenylalanine (-2.5); tryptophan (-3.4). It is understood that an amino acid can be
substituted for another having a similar hydrophilicity value and still yield a
biologically equivalent, and in particular, an immunologically equivalent polypeptide.
In such changes, the substitution of amino acids whose hydrophilicity values are
within ±2 is preferred, those that are within ±1 are particularly preferred, and those
within ±0.5 are even more particularly preferred.
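
The notion of "greatest local average hydrophilicity" can be illustrated by scanning a sequence with a sliding window. This is a hedged sketch: the values are the central values listed above (the ±1 tolerances given for aspartate, glutamate and proline are ignored), and the window length of six residues, the one-letter keys and the function name `most_hydrophilic_window` are assumptions made for illustration.

```python
# Central hydrophilicity values as listed in the text (one-letter codes).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": -0.5, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (start_index, average) of the window with the highest
    average hydrophilicity. The window length is an assumed parameter."""
    if window < 1 or len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_avg = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        avg = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if avg > best_avg:  # keep the first window on ties
            best_start, best_avg = start, avg
    return best_start, best_avg


if __name__ == "__main__":
    demo = "LLLLLLKKDDEELLLL"  # hypothetical sequence, not from any SEQ ID NO
    start, avg = most_hydrophilic_window(demo)
    print(start, demo[start:start + 6], avg)
```

A region found this way is only a candidate; whether it is in fact antigenic would have to be established experimentally.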

As outlined above, amino acid substitutions are generally therefore based on
the relative similarity of the amino acid side-chain substituents, for example, their
hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions
which take various of the foregoing characteristics into consideration are well known
to those of skill in the art and include: arginine and lysine; glutamate and aspartate;
serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine
(See Table 16, below). The present invention thus contemplates functional or
biological equivalents of a Streptococcus pneumoniae polypeptide as set forth above.



TABLE 16
Amino Acid Substitutions

Original Residue    Exemplary Residue Substitution
Ala                 Gly; Ser
Arg                 Lys
Asn                 Gln; His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Ala
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg
Met                 Leu; Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu
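
Table 16 lends itself to a lookup structure. The sketch below is illustrative only: it encodes the table rows as a dictionary and flags positions where a variant differs from a reference by a change that is not among the exemplary substitutions. The names `TABLE_16`, `is_exemplary_substitution` and `non_exemplary_positions` are hypothetical, and sequences are assumed to be equal-length lists of three-letter codes.

```python
# Exemplary substitutions from Table 16 (original residue -> allowed residues).
TABLE_16 = {
    "Ala": {"Gly", "Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"},
    "Asp": {"Glu"}, "Cys": {"Ser"}, "Gln": {"Asn"}, "Glu": {"Asp"},
    "Gly": {"Ala"}, "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"}, "Lys": {"Arg"}, "Met": {"Leu", "Tyr"},
    "Ser": {"Thr"}, "Thr": {"Ser"}, "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_exemplary_substitution(original, replacement):
    """True if Table 16 lists `replacement` for `original`."""
    return replacement in TABLE_16.get(original, set())


def non_exemplary_positions(reference, variant):
    """Return 0-based positions where the variant differs from the
    reference by a change that Table 16 does not list."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        i for i, (ref, var) in enumerate(zip(reference, variant))
        if ref != var and not is_exemplary_substitution(ref, var)
    ]


if __name__ == "__main__":
    ref = ["Ala", "Arg", "Met", "Trp"]   # hypothetical residues
    var = ["Ser", "Lys", "Tyr", "Phe"]
    print(non_exemplary_positions(ref, var))
```

Note that the table is directional as written: Tyr may be replaced by Phe, but Trp lists only Tyr, so a Trp to Phe change is flagged.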

Biological or functional equivalents of a polypeptide can also be prepared
using site-specific mutagenesis. Site-specific mutagenesis is a technique useful in
the preparation of second generation polypeptides, or biologically functional
equivalent polypeptides or peptides, derived from the sequences thereof, through
specific mutagenesis of the underlying DNA. As noted above, such changes can be
useful where amino acid substitutions are desired. The technique further
provides a ready ability to prepare and test sequence variants, for example,
incorporating one or more of the foregoing considerations, by introducing one or
more nucleotide sequence changes into the DNA. Site-specific mutagenesis allows
the production of mutants through the use of specific oligonucleotide sequences
which encode the DNA sequence of the desired mutation, as well as a sufficient
number of adjacent nucleotides, to provide a primer sequence of sufficient size and
sequence complexity to form a stable duplex on both sides of the deletion junction
being traversed. Typically, a primer of about 17 to 25 nucleotides in length is
preferred, with about 5 to 10 residues on both sides of the junction of the sequence
being altered.
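
The primer geometry described above (about 17 to 25 nucleotides, with about 5 to 10 unaltered residues on each side of the change) can be sketched as follows. This is an illustration under stated assumptions: the function name `mutagenic_primer`, its parameters and the demonstration template are hypothetical, and the primer is returned in the same orientation as the supplied template. In practice the strand that anneals to the single-stranded vector may be the reverse complement.

```python
def mutagenic_primer(template, start, end, replacement, flank=9):
    """Build a primer carrying `replacement` in place of template[start:end],
    flanked on each side by `flank` unaltered template nucleotides.

    Enforces the ranges given in the text: 5-10 flanking residues per side
    and an overall primer length of 17-25 nucleotides.
    """
    if not 5 <= flank <= 10:
        raise ValueError("flank should be about 5 to 10 nucleotides")
    if start < 0 or end < start:
        raise ValueError("invalid start/end coordinates")
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough template sequence around the mutation")
    primer = template[start - flank:start] + replacement + template[end:end + flank]
    if not 17 <= len(primer) <= 25:
        raise ValueError("primer length falls outside about 17 to 25 nucleotides")
    return primer


if __name__ == "__main__":
    # Hypothetical 30-nt coding sequence; the codon CTG at positions 12-14
    # is replaced by ATT.
    template = "ATGGCTAAAGGTCTGGAAGTTCCGCATTAA"
    print(mutagenic_primer(template, 12, 15, "ATT"))
```

The length checks mirror the "about" ranges in the text as hard limits purely for simplicity; a practical design would also weigh melting temperature and secondary structure.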

In general, the technique of site-specific mutagenesis is well known in the art.
As will be appreciated, the technique typically employs a phage vector which can
exist in both a single-stranded and double-stranded form. Typically, site-directed
mutagenesis in accordance herewith is performed by first obtaining a single-stranded
vector which includes within its sequence a DNA sequence which encodes all or a
portion of the Streptococcus pneumoniae polypeptide sequence selected. An
oligonucleotide primer bearing the desired mutated sequence is prepared (e.g.,
synthetically). This primer is then annealed to the single-stranded vector, and
extended by the use of enzymes such as E. coli polymerase I Klenow fragment, in
order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex
is formed wherein one strand encodes the original non-mutated sequence and the
second strand bears the desired mutation. This heteroduplex vector is then used to
transform appropriate cells such as E. coli cells, and clones are selected which
include recombinant vectors bearing the mutation. Commercially available kits come
with all the reagents necessary, except the oligonucleotide primers.

A Streptococcus pneumoniae polypeptide or polypeptide antigen of the
present invention is understood to be any Streptococcus pneumoniae polypeptide
comprising substantial sequence similarity, structural similarity and/or functional
similarity to a Streptococcus pneumoniae polypeptide comprising the amino acid
sequence of one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592
through SEQ ID NO:752. In addition, a Streptococcus pneumoniae polypeptide or
polypeptide antigen of the invention is not limited to a particular source. Thus, the
invention provides for the general detection and isolation of the polypeptides from a
variety of sources.

It is contemplated in the present invention that a Streptococcus pneumoniae
polypeptide may advantageously be cleaved into fragments for use in further
structural or functional analysis, or in the generation of reagents such as
Streptococcus pneumoniae-related polypeptides and Streptococcus
pneumoniae-specific antibodies. This can be accomplished by treating purified or
unpurified Streptococcus pneumoniae polypeptides with a peptidase such as
endoproteinase glu-C (Boehringer, Indianapolis, IN). Treatment with CNBr is another
method by which peptide fragments may be produced from natural Streptococcus
pneumoniae polypeptides. Recombinant techniques also can be used to produce specific
fragments of a Streptococcus pneumoniae polypeptide.

In addition, the inventors also contemplate that compounds sterically similar
to a particular Streptococcus pneumoniae polypeptide antigen may be formulated to
mimic the key portions of the peptide structure; such compounds are called
peptidomimetics. Mimetics are peptide-containing molecules which mimic elements of
protein secondary structure (see, e.g., Johnson et al., 1993). The underlying rationale
behind the use of peptide mimetics is that the peptide backbone of proteins exists
chiefly to orient amino acid side chains in such a way as to facilitate molecular
interactions, such as those of receptor and ligand.

Successful applications of the peptide mimetic concept have thus far focused
on mimetics of β-turns within proteins. Likely β-turn structures within Streptococcus
pneumoniae polypeptides can be predicted by computer-based algorithms as discussed
above. Once the component amino acids of the turn are determined, mimetics can be
constructed to achieve a similar spatial orientation of the essential elements of the
amino acid side chains, as discussed in Johnson et al., 1993.

Fragments of the Streptococcus pneumoniae polypeptides are also included
in the invention. A fragment is a polypeptide having an amino acid sequence that
is entirely the same as part, but not all, of the amino acid sequence of the
polypeptide. The fragment can comprise, for example, at least 7 (e.g., 8, 10, 12, 14,
16, 18, 20, or more) contiguous amino acids of an amino acid sequence of one of
SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752.
Fragments may be "freestanding" or comprised within a larger polypeptide of which
they form a part or region, most preferably as a single, continuous region. In one
embodiment, the fragments include at least one epitope of the mature polypeptide
sequence.
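
The definition of a fragment given above (a contiguous stretch identical to part, but not all, of a reference sequence, for example at least 7 residues long) can be expressed directly in code. The sketch below is illustrative; the function names and the demonstration sequence are hypothetical and do not correspond to any SEQ ID NO.

```python
def contiguous_fragments(sequence, length=7):
    """List every contiguous window of `length` residues in `sequence`."""
    if length < 1:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def is_fragment(candidate, sequence, min_length=7):
    """True if `candidate` matches part, but not all, of `sequence`
    and is at least `min_length` residues long."""
    return min_length <= len(candidate) < len(sequence) and candidate in sequence


if __name__ == "__main__":
    seq = "MKTLLVAGSA"  # hypothetical 10-residue sequence (one-letter codes)
    print(contiguous_fragments(seq, 7))
    print(is_fragment("TLLVAGS", seq))
```

The minimum length is a parameter because the text gives 7 only as an example; 8, 10, 12 or more residues can be substituted.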

"Fusion protein" refers to a protein or polypeptide encoded by two, often
unrelated, fused genes or fragments thereof. For example, fusion proteins or
polypeptides comprising various portions of constant region of immunoglobulin
molecules together with another human protein or part thereof have been described.
In many cases, employing an immunoglobulin Fc region as a part of a fusion protein
or polypeptide is advantageous for use in therapy and diagnosis resulting in, for
example, improved pharmacokinetic properties (see e.g., International Application
EP-A 0232 2621). On the other hand, for some uses it would be desirable to be able
to delete the Fc part after the fusion protein or polypeptide has been expressed,
detected and purified.

D. Streptococcus pneumoniae Polynucleotide and Polypeptide Variants 

"Variant," as the term is used hereinafter, is a polynucleotide or polypeptide
that differs from a reference polynucleotide or polypeptide respectively, but retains
essential properties. A typical variant of a polynucleotide differs in nucleotide
sequence from another, reference polynucleotide. Changes in the nucleotide
sequence of the variant may or may not alter the amino acid sequence of a
polypeptide encoded by the reference polynucleotide. Nucleotide changes may
result in amino acid substitutions, additions, deletions, fusions and truncations in the
polypeptide encoded by the reference sequence, as discussed below. A typical
variant of a polypeptide differs in amino acid sequence from another, reference
polypeptide. Generally, differences are limited so that the sequences of the
reference polypeptide and the variant are closely similar overall and, in many
regions, identical. A variant and reference polypeptide may differ in amino acid
sequence by one or more substitutions, additions, or deletions in any combination. A
substituted or inserted amino acid residue may or may not be one encoded by the
genetic code. A variant of a polynucleotide or polypeptide may be naturally
occurring, such as an allelic variant, or it may be a variant that is not known to occur
naturally. Non-naturally occurring variants of polynucleotides and polypeptides may
be made by mutagenesis techniques or by direct synthesis.

"Identity," as known in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by
comparing the sequences. In the art, "identity" also means the degree of sequence
relatedness between polypeptide or polynucleotide sequences, as the case may be,
as determined by the match between strings of such sequences. "Identity" and
"similarity" can be readily calculated by known methods, including but not limited to
those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford
University Press, New York, 1988; Biocomputing: Informatics and Genome Projects,
Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of
Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New
Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic
Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M
Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied
Math., 48: 1073 (1988). Preferred methods to determine identity are designed to
give the largest match between the sequences tested. Methods to determine identity
and similarity are codified in publicly available computer programs. Preferred
computer program methods to determine identity and similarity between two
sequences include, but are not limited to, the GCG program package (Devereux, J.,
et al., 1984), BLASTP, BLASTN, TBLASTN and FASTA (Altschul, S. F., et al., 1990).
The BLASTX program is publicly available from NCBI and other sources (BLAST
Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894; Altschul, S., et al.,
1990). The well known Smith-Waterman algorithm may also be used to determine
identity.
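
Once two sequences have been aligned by one of the programs named above, percent identity reduces to counting matching columns. The following is a minimal sketch, not the method of any of those programs: it assumes the alignment has already been computed, uses "-" as the gap character, and takes the number of alignment columns as the denominator, which is only one of several conventions in use. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a pre-computed pairwise alignment.

    Both strings must have the same length (the alignment length).
    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical alignment of two short nucleotide sequences.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

Because different tools divide by different lengths (alignment length, shorter sequence, or reference sequence), values from this sketch need not agree with those reported by any particular program.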

By way of example, a polynucleotide sequence of the present invention may
be identical to the reference sequence of one of SEQ ID NO:1 through SEQ ID
NO:215 or SEQ ID NO:431 through SEQ ID NO:591, that is, it may be 100% identical,
or it may include up to a certain integer number of nucleotide alterations as compared
to the reference sequence. Such alterations are selected from the group consisting of
at least one nucleotide deletion, substitution, including transition and transversion, or
insertion, and wherein said alterations may occur at the 5' or 3' terminal positions of
the reference nucleotide sequence or anywhere between those terminal positions,
interspersed either individually among the nucleotides in the reference sequence or
in one or more contiguous groups within the reference sequence. The number of
nucleotide alterations is determined by multiplying the total number of nucleotides in
one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO:431 through SEQ ID
NO:591 by the numerical percent of the respective percent identity (divided by 100)
and subtracting that product from said total number of nucleotides in one of SEQ ID
NO:1 through SEQ ID NO:215 or SEQ ID NO:431 through SEQ ID NO:591.

For example, an isolated Streptococcus pneumoniae polynucleotide may
comprise a polynucleotide sequence that has at least 70% identity to the nucleic
acid sequence of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID NO:431
through SEQ ID NO:591, a degenerate variant thereof or a fragment thereof,
wherein the polynucleotide sequence may include up to $n_n$ nucleic acid alterations
over the entire polynucleotide region of the nucleic acid sequence of one of SEQ ID
NO:1 through SEQ ID NO:215 or SEQ ID NO:431 through SEQ ID NO:591, wherein
$n_n$ is the maximum number of alterations and is calculated by the formula:

$$n_n \le x_n - (x_n \cdot y),$$

in which $x_n$ is the total number of nucleic acids of one of SEQ ID NO:1 through SEQ
ID NO:215 or SEQ ID NO:431 through SEQ ID NO:591 and $y$ has a value of 0.70,
wherein any non-integer product of $x_n$ and $y$ is rounded down to the nearest integer
prior to subtracting such product from $x_n$. Of course, $y$ may also have a value of 0.80
for 80%, 0.85 for 85%, 0.90 for 90%, 0.95 for 95%, etc. Alterations of a
polynucleotide sequence encoding one of the polypeptides of SEQ ID NO:216
through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752 may create
nonsense, missense or frameshift mutations in this coding sequence and thereby
alter the polypeptide encoded by the polynucleotide following such alterations.
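
The alteration limit defined by the formula above is straightforward to compute. The sketch below is illustrative: the function name `max_alterations` is hypothetical, and the percent identity is passed as an integer percentage (70 rather than 0.70) so that the rounding-down step is done in exact integer arithmetic instead of floating point.

```python
def max_alterations(total_length, percent_identity):
    """Maximum number of alterations permitted for a reference sequence of
    `total_length` residues at the given integer percent identity.

    Implements n <= x - (x * y), with the product x * y rounded down to the
    nearest integer before subtraction, as the text specifies.
    """
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    product = (total_length * percent_identity) // 100  # rounded down
    return total_length - product


if __name__ == "__main__":
    # Hypothetical reference lengths; not taken from any SEQ ID NO.
    for length, pct in [(1000, 70), (999, 70), (215, 95)]:
        print(length, pct, max_alterations(length, pct))
```

Rounding the product down (rather than rounding the result) means that a fractional product always yields one additional permitted alteration, as with the 999-nucleotide case.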

Similarly, a polypeptide sequence of the present invention may be identical to
the reference sequence of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID
NO:592 through SEQ ID NO:752, that is, it may be 100% identical, or it may include
up to a certain integer number of amino acid alterations as compared to the reference
sequence such that the % identity is less than 100%. Such alterations are selected
from the group consisting of at least one amino acid deletion, substitution, including
conservative and non-conservative substitution, or insertion, and wherein said
alterations may occur at the amino- or carboxy-terminal positions of the reference
polypeptide sequence or anywhere between those terminal positions, interspersed
either individually among the amino acids in the reference sequence or in one or
more contiguous groups within the reference sequence. The number of amino acid
alterations for a given % identity is determined by multiplying the total number of
amino acids in one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592
through SEQ ID NO:752 by the numerical percent of the respective percent identity
(divided by 100) and then subtracting that product from said total number of amino
acids in one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through
SEQ ID NO:752, or:

$$n_a \le x_a - (x_a \cdot y),$$

wherein $n_a$ is the number of amino acid alterations, $x_a$ is the total number of amino
acids in one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through
SEQ ID NO:752, and $y$ is, for instance, 0.70 for 70%, 0.80 for 80%, 0.85 for 85%, etc.,
and wherein any non-integer product of $x_a$ and $y$ is rounded down to the nearest
integer prior to subtracting it from $x_a$.
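
As a worked illustration of the amino acid formula, consider a hypothetical reference polypeptide of 333 residues (a length chosen only for the arithmetic, not taken from any SEQ ID NO) and a required identity of 95%:

```latex
\begin{align*}
x_a &= 333, \qquad y = 0.95 \\
x_a \cdot y &= 316.35 \;\longrightarrow\; 316 \quad \text{(rounded down)} \\
n_a &\le x_a - 316 = 17
\end{align*}
```

Under these assumptions, a variant with up to 17 amino acid alterations would still fall within the 95% identity limit.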

E. Vectors, Host Cells and Recombinant Streptococcus pneumoniae 
Polypeptides 

In a preferred embodiment, the present invention provides expression vectors
comprising ORF polynucleotides that encode Streptococcus pneumoniae
polypeptides. Preferably, the expression vectors of the present invention comprise
ORF polynucleotides that encode Streptococcus pneumoniae polypeptides
comprising the amino acid residue sequence of one of SEQ ID NO:216 through SEQ
ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752. More preferably, the
expression vectors of the present invention comprise a polynucleotide comprising the
nucleotide base sequence of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ
ID NO:431 through SEQ ID NO:591. Even more preferably, the expression vectors
of the invention comprise a polynucleotide operatively linked to an
enhancer-promoter. More preferably still, the expression vectors of the invention
comprise a polynucleotide operatively linked to a prokaryotic promoter. Alternatively,
the expression vectors of the present invention comprise a polynucleotide operatively
linked to an enhancer-promoter that is a eukaryotic promoter, and the expression
vectors further comprise a polyadenylation signal that is positioned 3' of the
carboxy-terminal amino acid and within a transcriptional unit of the encoded
polypeptide.

Expression of proteins in prokaryotes is most often carried out in E. coli with
vectors containing constitutive or inducible promoters directing the expression of
either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to
a protein encoded therein, usually to the amino terminus of the recombinant protein.
Such fusion vectors typically serve three purposes: 1) to increase expression of
recombinant protein; 2) to increase the solubility of the recombinant protein; and 3) to
aid in the purification of the recombinant protein by acting as a ligand in affinity
purification. Often, in fusion expression vectors, a proteolytic cleavage site is
introduced at the junction of the fusion moiety and the recombinant protein to enable
separation of the recombinant protein from the fusion moiety subsequent to
purification of the fusion protein. Enzymes suitable for such cleavage, and their
cognate recognition sequences, include Factor Xa, thrombin and enterokinase.

Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc;
Smith and Johnson, 1988), pMAL (New England Biolabs, Beverly, MA) and pRIT5
(Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E
binding protein, or protein A, respectively, to the target recombinant protein.

In one embodiment, the coding sequence of the Streptococcus pneumoniae
polynucleotide is cloned into a pGEX expression vector to create a vector encoding a
fusion protein comprising, from the N-terminus to the C-terminus, GST-thrombin
cleavage site-Streptococcus pneumoniae polypeptide. The fusion protein can be
purified by affinity chromatography using glutathione-agarose resin. Recombinant
Streptococcus pneumoniae polypeptide unfused to GST can be recovered by
cleavage of the fusion protein with thrombin.

Examples of suitable inducible non-fusion E. coli expression vectors include
pTrc (Amann et al., 1988), pET 11d (Studier et al., 1990), pBAD and pCRT7. Target
gene expression from the pTrc vector relies on host RNA polymerase transcription
from a hybrid trp-lac fusion promoter. Target gene expression from the pET 11d
vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a
coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host
strains BL21(DE3) or HMS174(DE3) from a resident prophage harboring a T7 gn1
gene under the transcriptional control of the lacUV5 promoter.

One strategy to maximize recombinant protein expression in E. coli is to
express the protein in a host bacterium with an impaired capacity to proteolytically
cleave the recombinant protein. Another strategy is to alter the nucleic acid
sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in E. coli. Such
alteration of nucleic acid sequences of the invention can be carried out by standard
DNA mutagenesis or synthesis techniques.
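
The codon-replacement strategy can be sketched as a synonymous recoding step that leaves the encoded protein unchanged. This is a minimal illustration under stated assumptions: the standard genetic code is used, the function names `recode` and `translate` are hypothetical, and the small `ASSUMED_PREFERRED` table contains placeholder choices for illustration only. A real workflow would take preferred codons from a measured codon-usage table for the chosen host strain.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: illustrative preferred codons, not taken from the source text.
ASSUMED_PREFERRED = {"L": "CTG", "K": "AAA", "E": "GAA", "R": "CGT"}


def translate(cds):
    """Translate a coding sequence (length divisible by 3) to one-letter codes."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Replace each codon with the preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TO_AA[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard against a table entry that would change the protein.
        if CODON_TO_AA[new_codon] != amino_acid:
            raise ValueError("preferred codon is not synonymous")
        out.append(new_codon)
    return "".join(out)


if __name__ == "__main__":
    original = "ATGTTAAAGGAGAGATAA"  # hypothetical ORF encoding M-L-K-E-R-stop
    optimized = recode(original, ASSUMED_PREFERRED)
    print(optimized, translate(original) == translate(optimized))
```

The synonymy check inside `recode` ensures that an incorrect entry in the preference table raises an error instead of silently altering the encoded polypeptide.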

In another embodiment, the Streptococcus pneumoniae polynucleotide
expression vector is a yeast expression vector. Examples of vectors for expression
in yeast S. cerevisiae include pYepSec1 (Baldari, et al., 1987), pMFa (Kurjan and
Herskowitz, 1982), pJRY88 (Schultz et al., 1987), and pYES2 (Invitrogen
Corporation, San Diego, CA).

Alternatively, a Streptococcus pneumoniae polynucleotide can be expressed
in insect cells using, for example, baculovirus expression vectors. Baculovirus
vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells)
include the pAc series (Smith et al., 1983) and the pVL series (Lucklow and
Summers, 1989).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, 1987) and pMT2PC (Kaufman et al.,
1987). When used in mammalian cells, the expression vector's control functions are
often provided by viral regulatory elements.

As used hereinafter, a promoter is a region of a DNA molecule typically within
about 100 nucleotide pairs in front of (upstream of) the point at which transcription
begins (i.e., a transcription start site). That region typically contains several types of
DNA sequence elements that are located in similar relative positions in different
genes. As used hereinafter, the term "promoter" includes what is referred to in the
art as an upstream promoter region, a promoter region or a promoter of a
generalized eukaryotic RNA Polymerase II transcription unit.

Another type of discrete transcription regulatory sequence element is an
enhancer. An enhancer provides specificity of time, location and expression level for
a particular encoding region (e.g., gene). A major function of an enhancer is to
increase the level of transcription of a coding sequence in a cell that contains one or
more transcription factors that bind to that enhancer. Unlike a promoter, an enhancer
can function when located at variable distances from transcription start sites so long
as a promoter is present.

As used hereinafter, the phrase "enhancer-promoter" means a composite unit
that contains both enhancer and promoter elements. An enhancer-promoter is
operatively linked to a coding sequence that encodes at least one gene product. As
used hereinafter, the phrase "operatively linked" means that an enhancer-promoter is
connected to a coding sequence in such a way that the transcription of that coding
sequence is controlled and regulated by that enhancer-promoter. Means for
operatively linking an enhancer-promoter to a coding sequence are well known in the
art. As is also well known in the art, the precise orientation and location relative to a
coding sequence whose transcription is controlled is dependent, inter alia, upon the
specific nature of the enhancer-promoter. Thus, a TATA box minimal promoter is
typically located from about 25 to about 30 base pairs upstream of a transcription
initiation site and an upstream promoter element is typically located from about 100
to about 200 base pairs upstream of a transcription initiation site. In contrast, an
enhancer can be located downstream from the initiation site and can be at a
considerable distance from that site.

An enhancer-promoter used in a vector construct of the present invention can
be any enhancer-promoter that drives expression in a cell to be transfected. By
employing an enhancer-promoter with well-known properties, the level and pattern of
gene product expression can be optimized.

For example, commonly used promoters are derived from polyoma,
Adenovirus 2, cytomegalovirus and Simian Virus 40. For other suitable expression
systems for both prokaryotic and eukaryotic cells see chapters 16 and 17 of
Sambrook et al., "Molecular Cloning: A Laboratory Manual," 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
1989, incorporated hereinafter by reference.

In another embodiment, the recombinant mammalian expression vector is
capable of directing expression of the nucleic acid preferentially in a particular cell
type (e.g., tissue-specific regulatory elements are used to express the nucleic acid).
Tissue-specific regulatory elements are known in the art. Non-limiting examples of
suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert
et al., 1987), lymphoid-specific promoters (Calame and Eaton, 1988), in particular,
promoters of T cell receptors (Winoto and Baltimore, 1989) and immunoglobulins
(Banerji et al., 1983; Queen and Baltimore, 1983), neuron-specific promoters (e.g.,
the neurofilament promoter; Byrne and Ruddle, 1989), pancreas-specific promoters
(Edlund et al., 1985), and mammary gland-specific promoters (e.g., milk whey
promoter; U.S. Patent 4,873,316 and International Application EP 264,166).
Developmentally-regulated promoters are also encompassed, for example the
murine hox promoters (Kessel and Gruss, 1990) and the α-fetoprotein promoter
(Campes and Tilghman, 1989).

The invention further provides a recombinant expression vector comprising a 
DNA molecule encoding a Streptococcus pneumoniae polypeptide cloned into the 
expression vector in an antisense orientation. That is, the DNA molecule is 
operatively linked to a regulatory sequence in a manner which allows for expression 
(by transcription of the DNA molecule) of an RNA molecule which is antisense to 
Streptococcus pneumoniae mRNA. Regulatory sequences operatively linked to a 
nucleic acid cloned in the antisense orientation can be chosen which direct the 
continuous expression of the antisense RNA molecule in a variety of cell types. For 
instance, viral promoters and/or enhancers, or regulatory sequences, can be chosen 
which direct constitutive, tissue specific or cell type specific expression of antisense 
RNA. The antisense expression vector can be in the form of a recombinant plasmid, 
phagemid or attenuated virus in which antisense nucleic acids are produced under 
the control of a high efficiency regulatory region, the activity of which can be 
determined by the cell type into which the vector is introduced. 

Another aspect of the invention pertains to host cells into which a 
recombinant expression vector of the invention has been introduced. The terms 
"host cell" and "recombinant host cell" are used interchangeably hereinafter. It is 
understood that such terms refer not only to the particular subject cell, but to the 
progeny or potential progeny of such a cell. Because certain modifications may 
occur in succeeding generations due to either mutation or environmental influences, 
such progeny may not, in fact, be identical to the parent cell, but are still included 
within the scope of the term as used hereinafter. A host cell can be any prokaryotic 
or eukaryotic cell. For example, a Streptococcus pneumoniae polypeptide can be 
expressed in bacterial cells such as E. coli, insect cells (such as Sf9, Sf21), yeast or 
mammalian cells (such as Chinese hamster ovary cells (CHO), VERO, chick embryo 
fibroblasts, BHK cells or COS cells). Other suitable host cells are known to those 
skilled in the art. 

Vector DNA is introduced into prokaryotic or eukaryotic cells via conventional 
transformation, infection or transfection techniques. As used hereinafter, the terms 
"transformation" and "transfection" are intended to refer to a variety of art-recognized 
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including 
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated 
transfection, lipofection, ultrasound or electroporation. Suitable methods for 
transforming or transfecting host cells can be found in Sambrook, et al. ("Molecular 
Cloning: A Laboratory Manual", 2nd ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), and other laboratory 
manuals. 

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in 
culture, can be used to produce (i.e., express) a Streptococcus pneumoniae 
polypeptide. Accordingly, the invention further provides methods for producing a 
Streptococcus pneumoniae polypeptide using the host cells of the invention. In one 
embodiment, the method comprises culturing the host cell of the invention (into which a 
recombinant expression vector encoding a Streptococcus pneumoniae polypeptide 
has been introduced) in a suitable medium until the Streptococcus pneumoniae 
polypeptide is produced. In another embodiment, the method further comprises 
isolating the Streptococcus pneumoniae polypeptide from the medium or the host 
cell. 

A coding sequence of an expression vector is operatively linked to a 
transcription termination region. RNA polymerase transcribes an encoding DNA 
sequence through a site where polyadenylation occurs. Typically, DNA sequences 
located a few hundred base pairs downstream of the polyadenylation site serve to 
terminate transcription. Those DNA sequences are referred to hereinafter as 
transcription-termination regions. Those regions are required for efficient 
polyadenylation of transcribed messenger RNA (mRNA). Transcription-termination 
regions are well known in the art. A preferred transcription-termination region used in 
an adenovirus vector construct of the present invention comprises a polyadenylation 
signal of SV40 or the protamine gene. 

An expression vector comprises a polynucleotide that encodes a 
Streptococcus pneumoniae polypeptide. Such a polynucleotide is meant to include a 
sequence of nucleotide bases encoding a Streptococcus pneumoniae polypeptide 
sufficient in length to distinguish the segment from a polynucleotide segment 
encoding a non-Streptococcus pneumoniae polypeptide. A polynucleotide of the 
invention can also encode biologically functional polypeptides or peptides which have 
variant amino acid sequences, such as with changes selected based on 
considerations such as the relative hydropathic score of the amino acids being 
exchanged. These variant sequences are those isolated from natural sources or 
induced in the sequences disclosed hereinafter using a mutagenic procedure such as 
site-directed mutagenesis. 

Preferably, the expression vectors of the present invention comprise 
polynucleotides that encode polypeptides comprising the amino acid residue 
sequence of one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592 
through SEQ ID NO:752. An expression vector can include a Streptococcus 
pneumoniae polypeptide coding region itself of any of the Streptococcus pneumoniae 
polypeptides noted above or it can contain coding regions bearing selected 
alterations or modifications in the basic coding region of such a Streptococcus 
pneumoniae polypeptide. Alternatively, such vectors or fragments can code for larger 
polypeptides or polypeptides which nevertheless include the basic coding region. In 
any event, it should be appreciated that due to codon redundancy as well as 
biological functional equivalence, this aspect of the invention is not limited to the 
particular DNA molecules corresponding to the polypeptide sequences noted above. 
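
The point about codon redundancy can be illustrated with a short sketch showing that two different DNA sequences may encode the same amino acid sequence. The sketch assumes the standard genetic code; the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
# Illustrative sketch of codon redundancy under the standard genetic code.
# Function names and example sequences are hypothetical.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) to one-letter codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_polypeptide(dna_a, dna_b):
    """True when two distinct DNA molecules specify the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

For example, `ATGGCTTCT` and `ATGGCCAGC` differ at four nucleotide positions yet both translate to Met-Ala-Ser, which is the sense in which the invention is not limited to one particular DNA molecule per polypeptide.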

Exemplary vectors include the mammalian expression vectors of the pCMV 
family including pCMV6b and pCMV6c (Chiron Corp., Emeryville, CA). In certain 
cases, and specifically in the case of these individual mammalian expression vectors, 
the resulting constructs can require co-transfection with a vector containing a 
selectable marker such as pSV2neo. Via co-transfection into a dihydrofolate 
reductase-deficient Chinese hamster ovary cell line, such as DG44, clones 
expressing Streptococcus pneumoniae polypeptides by virtue of DNA incorporated 
into such expression vectors can be detected. 

A DNA molecule of the present invention can be incorporated into a vector by 
a number of techniques that are well known in the art. For instance, the vector 
pUC18 has been demonstrated to be of particular value in cloning and expression of 
genes. Likewise, the related vectors M13mp18 and M13mp19 can be used in certain 
embodiments of the invention, in particular, in performing dideoxy sequencing. 

An expression vector of the present invention is useful both as a means for 
preparing quantities of the Streptococcus pneumoniae polypeptide-encoding DNA 
itself, and as a means for preparing the encoded polypeptide and peptides. It is 
contemplated that where Streptococcus pneumoniae polypeptides of the invention 
are made by recombinant means, one can employ either prokaryotic or eukaryotic 
expression vectors as shuttle systems. 

In another aspect, the recombinant host cells of the present invention are 
prokaryotic host cells. Preferably, the recombinant host cells of the invention are 
bacterial cells of the DH5α strain of Escherichia coli. In general, prokaryotes are 
preferred for the initial cloning of DNA sequences and constructing the vectors useful 
in the invention. For example, E. coli K12 strains can be particularly useful. Other 
microbial strains that can be used include E. coli B, and E. coli χ1976 (ATCC No. 
31537). These examples are, of course, intended to be illustrative rather than 
limiting. 

The aforementioned strains, as well as E. coli W3110 (ATCC No. 273325), E. 
coli BL21(DE3), E. coli Top10, bacilli such as Bacillus subtilis, or other 
enterobacteriaceae such as Salmonella typhimurium (or other attenuated Salmonella 
strains as described in U.S. Patent 4,837,151) or Serratia marcescens, and various 
Pseudomonas species can be used. 

In general, plasmid vectors containing replicon and control sequences, which 
are derived from species compatible with the host cell are used in connection with 
these hosts. The vector ordinarily carries a replication site, as well as marking 
sequences which are capable of providing phenotypic selection in transformed cells. 
For example, E. coli can be transformed using pBR322, a plasmid derived from an E. 
coli species (Bolivar, et al. 1977). pBR322 contains genes for ampicillin and 
tetracycline resistance and thus provides easy means for identifying transformed 
cells. The pBR plasmid, or other microbial plasmid or phage must also contain, or be 
modified to contain, promoters which can be used by the microbial organism for 
expression of its own polypeptides. 

Those promoters most commonly used in recombinant DNA construction 
include the β-lactamase (penicillinase) and lactose promoter systems (Chang, et al. 
1978; Itakura, et al. 1977; Goeddel, et al. 1979; Goeddel, et al. 1980) and a 
tryptophan (TRP) promoter system (EP 0036776; Siebwenlist, et al. 1980). While 
these are the most commonly used, other microbial promoters have been discovered 
and utilized, and details concerning their nucleotide sequences have been published, 
enabling a skilled worker to introduce functional promoters into plasmid vectors 
(Siebwenlist, et al. 1980). 

In addition to prokaryotes, eukaryotic microbes such as yeast can also be 
used. Saccharomyces cerevisiae or common baker's yeast is the most commonly 
used among eukaryotic microorganisms, although a number of other strains are 
commonly available. For expression in Saccharomyces, the plasmid YRp7, for 
example, is commonly used (Stinchcomb, et al. 1979; Kingsman, et al. 1979; 
Tschemper, et al. 1980). This plasmid already contains the trp1 gene which provides 
a selection marker for a mutant strain of yeast lacking the ability to grow in the 
absence of tryptophan, for example ATCC No. 44076 or PEP4-1 (Jones, 1977). The 
presence of the trp1 lesion as a characteristic of the yeast host cell genome then 
provides an effective environment for detecting transformation by growth in the 
absence of tryptophan. 

Suitable promoter sequences in yeast vectors include the promoters for 
3-phosphoglycerate kinase (Hitzeman, et al. 1980) or other glycolytic enzymes (Hess, 
et al. 1968; Holland, et al. 1978) such as enolase, glyceraldehyde-3-phosphate 
dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, 
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, 
triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In 
constructing suitable expression plasmids, the termination sequences associated 
with these genes are also introduced into the expression vector downstream from the 
sequences to be expressed to provide polyadenylation of the mRNA and termination. 
Other promoters, which have the additional advantage of transcription controlled by 
growth conditions, are the promoter regions for alcohol dehydrogenase 2, 
isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen 
metabolism, and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, 
and enzymes responsible for maltose and galactose utilization. Any plasmid vector 
containing a yeast-compatible promoter, origin of replication and termination 
sequences is suitable. 

In addition to microorganisms, cultures of cells derived from multicellular 
organisms can also be used as hosts. In principle, any such cell culture is workable, 
whether from vertebrate or invertebrate culture. However, interest has been greatest 
in vertebrate cells, and propagation of vertebrate cells in culture (tissue culture) has 
become a routine procedure in recent years. Examples of such useful host cell lines 
are AtT-20, VERO, HeLa, NSO, PER C6, Chinese hamster ovary (CHO) cell lines, 
and WI38, BHK, COSM6, COS-7, 293 and MDCK cell lines. Expression vectors for 
such cells ordinarily include (if necessary) an origin of replication, a promoter located 
upstream of the gene to be expressed, along with any necessary ribosome binding 
sites, RNA splice sites, polyadenylation site, and transcriptional terminator 
sequences. 

Where expression of recombinant Streptococcus pneumoniae polypeptides is 
desired and a eukaryotic host is contemplated, it is most desirable to employ a vector 
such as a plasmid, that incorporates a eukaryotic origin of replication. Additionally, 
for the purposes of expression in eukaryotic systems, one desires to position the 
Streptococcus pneumoniae encoding sequence adjacent to and under the control of 
an effective eukaryotic promoter such as promoters used in combination with 
Chinese hamster ovary cells. To bring a coding sequence under control of a 
promoter, whether it is eukaryotic or prokaryotic, the 5' end of the translation initiation 
region of the proper translational reading frame of the polypeptide must be positioned 
between about 1 and about 50 nucleotides 3' of or downstream with respect to the 
promoter chosen. Furthermore, where eukaryotic expression is anticipated, one 
would typically desire to incorporate into the transcriptional unit which includes the 
Streptococcus pneumoniae polypeptide. 

Means of transforming or transfecting cells with exogenous polynucleotide 
such as DNA molecules are well known in the art and include techniques such as 
calcium-phosphate- or DEAE-dextran-mediated transfection, protoplast fusion, 
electroporation, liposome mediated transfection, direct microinjection and adenovirus 
infection (see e.g., Sambrook, Fritsch and Maniatis, 1989). 

The most widely used method is transfection mediated by either calcium 
phosphate or DEAE-dextran. Although the mechanism remains obscure, it is 
believed that the transfected DNA enters the cytoplasm of the cell by endocytosis 
and is transported to the nucleus. Depending on the cell type, up to 90% of a 
population of cultured cells can be transfected at any one time. Because of its high 
efficiency, transfection mediated by calcium phosphate or DEAE-dextran is the 
method of choice for experiments that require transient expression of the foreign 
DNA in large numbers of cells. Calcium phosphate-mediated transfection is also 
used to establish cell lines that integrate copies of the foreign DNA, which are usually 
arranged in head-to-tail tandem arrays into the host cell genome. 

In the protoplast fusion method, protoplasts derived from bacteria carrying 
high numbers of copies of a plasmid of interest are mixed directly with cultured 
mammalian cells. After fusion of the cell membranes (usually with polyethylene 
glycol), the contents of the bacteria are delivered into the cytoplasm of the 
mammalian cells and the plasmid DNA is transported to the nucleus. Protoplast 
fusion is not as efficient as transfection for many of the cell lines that are commonly 
used for transient expression assays, but it is useful for cell lines in which 
endocytosis of DNA occurs inefficiently. Protoplast fusion frequently yields multiple 
copies of the plasmid DNA tandemly integrated into the host chromosome. 

The application of brief, high-voltage electric pulses to a variety of mammalian 
and plant cells leads to the formation of nanometer-sized pores in the plasma 
membrane. DNA is taken directly into the cell cytoplasm either through these pores 
or as a consequence of the redistribution of membrane components that 
accompanies closure of the pores. Electroporation can be extremely efficient and can 
be used both for transient expression of cloned genes and for establishment of cell 
lines that carry integrated copies of the gene of interest. Electroporation, in contrast 
to calcium phosphate-mediated transfection and protoplast fusion, frequently gives 
rise to cell lines that carry one, or at most a few, integrated copies of the foreign 
DNA. 

Liposome transfection involves encapsulation of DNA and RNA within 
liposomes, followed by fusion of the liposomes with the cell membrane. The 
mechanism of how DNA is delivered into the cell is unclear but transfection 
efficiencies can be as high as 90%. 

Direct microinjection of a DNA molecule into nuclei has the advantage of not 
exposing DNA to cellular compartments such as low-pH endosomes. Microinjection 
is therefore used primarily as a method to establish lines of cells that carry integrated 
copies of the DNA of interest. 

The use of adenovirus as a vector for cell transfection is well known in the art. 
Adenovirus vector-mediated cell transfection has been reported for various cells 
(Stratford-Perricaudet, et al. 1992). 

A transfected cell can be prokaryotic or eukaryotic. Preferably, the host cells 
of the invention are prokaryotic host cells. Where it is of interest to produce a 
Streptococcus pneumoniae polypeptide, cultured prokaryotic host cells are of 
particular interest. 

In yet another embodiment, the present invention contemplates a process or 
method of preparing Streptococcus pneumoniae polypeptides comprising 
transforming, transfecting or infecting cells with a polynucleotide that encodes a 
Streptococcus pneumoniae polypeptide to produce transformed host cells; and 
maintaining the transformed host cells under biological conditions sufficient for 
expression of the polypeptide. Preferably, the transformed host cells are prokaryotic 
cells. Alternatively, the host cells are eukaryotic cells. More preferably, the 
prokaryotic cells are bacterial cells of the DH5α strain of Escherichia coli. Even more 
preferably, the polynucleotide transfected into the transformed cells comprises the 
nucleic acid sequence of one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID 
NO: 431 through SEQ ID NO: 591. Additionally, transfection is accomplished using 
an expression vector disclosed above. A host cell used in the process is capable of 
expressing a functional, recombinant Streptococcus pneumoniae polypeptide. 

Following transfection, the cell is maintained under culture conditions for a 
period of time sufficient for expression of a Streptococcus pneumoniae polypeptide. 
Culture conditions are well known in the art and include ionic composition and 
concentration, temperature, pH and the like. Typically, transfected cells are 
maintained under culture conditions in a culture medium. Suitable media for various 
cell types are well known in the art. In a preferred embodiment, temperature is from 
about 20°C to about 50°C, more preferably from about 30°C to about 40°C and, even 
more preferably about 37°C. 

The pH is preferably from a value of about 6.0 to a value of about 8.0, more 
preferably from a value of about 6.8 to a value of about 7.8 and, most 
preferably about 7.4. Osmolality is preferably from about 200 milliosmols per liter 
(mosm/L) to about 400 mosm/L and, more preferably from about 290 mosm/L to 
about 310 mosm/L. Other biological conditions needed for transfection and 
expression of an encoded protein are well known in the art. 
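
The nested ranges stated above for temperature, pH and osmolality can be collected into a small lookup, as in the following sketch. The dictionary keys, function name and return labels are assumptions made for illustration; the numerical limits are those given in the text, which are qualified by "about", so the hard boundaries used here are a simplification.

```python
# Illustrative sketch: classify a culture condition against the stated ranges.
# Keys, labels and the hard cut-offs are hypothetical simplifications.

RANGES = {
    "temperature_c": {"broad": (20.0, 50.0), "preferred": (30.0, 40.0)},
    "ph": {"broad": (6.0, 8.0), "preferred": (6.8, 7.8)},
    "osmolality_mosm_per_l": {"broad": (200.0, 400.0), "preferred": (290.0, 310.0)},
}


def classify(parameter, value):
    """Return which stated band, if any, a measured value falls into."""
    bands = RANGES[parameter]
    low, high = bands["preferred"]
    if low <= value <= high:
        return "preferred"
    low, high = bands["broad"]
    if low <= value <= high:
        return "within stated range"
    return "outside stated range"
```

For instance, `classify("temperature_c", 37)` falls in the narrower band, while a pH of 6.5 lies inside the broad range but outside the more preferred one.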

Transfected cells are maintained for a period of time sufficient for expression 
of a Streptococcus pneumoniae polypeptide. A suitable time depends inter alia 
upon the cell type used and is readily determinable by a skilled artisan. Typically, 
maintenance time is from about 2 to about 14 days. 

Recombinant Streptococcus pneumoniae polypeptide is recovered or 
collected either from the transfected cells or the medium in which those cells are 
cultured. Recovery comprises isolating and purifying the Streptococcus pneumoniae 
polypeptide. Isolation and purification techniques for polypeptides are well known in 
the art and include such procedures as precipitation, filtration, chromatography, 
electrophoresis and the like. 

F. ANTIBODIES IMMUNOREACTIVE WITH STREPTOCOCCUS PNEUMONIAE POLYPEPTIDES 

In still another embodiment, the present invention provides antibodies 
immunoreactive with Streptococcus pneumoniae polypeptides. Preferably, the 
antibodies of the invention are monoclonal antibodies. Additionally, the 
Streptococcus pneumoniae polypeptides comprise the amino acid residue sequence 
of one of SEQ ID NO:216 through SEQ ID NO:430 or SEQ ID NO: 592 through SEQ 
ID NO: 752. Means for preparing and characterizing antibodies are well known in the 
art (See, e.g., "Antibodies: A Laboratory Manual", E. Harlow and D. Lane, Cold Spring 
Harbor Laboratory, 1988). 

Briefly, a polyclonal antibody is prepared by immunizing an animal with an 
immunogen comprising a polypeptide or polynucleotide of the present invention, and 
collecting antisera from that immunized animal. A wide range of animal species can 
be used for the production of antisera. Typically an animal used for production of 
antisera is a rabbit, a mouse, a rat, a hamster or a guinea pig. Because of the 
relatively large blood volume of rabbits, a rabbit is a preferred choice for production 
of polyclonal antibodies. 

As is well known in the art, a given polypeptide or polynucleotide may vary in 
its immunogenicity. It is often necessary therefore to couple the immunogen (e.g., a 
polypeptide or polynucleotide) of the present invention with a carrier. Exemplary and 
preferred carriers are CRM197, keyhole limpet hemocyanin (KLH) and bovine serum 
albumin (BSA). Other albumins such as ovalbumin, mouse serum albumin or rabbit 
serum albumin can also be used as carriers. 

Means for conjugating a polypeptide or a polynucleotide to a carrier protein 
are well known in the art and include glutaraldehyde, 
m-maleimidobenzoyl-N-hydroxysuccinimide ester, carbodiimide and bis-diazotized 
benzidine. 

The amount of immunogen used for the production of polyclonal antibodies 
varies depending, inter alia, upon the nature of the immunogen as well as the animal 
used for immunization. A variety of routes can be used to administer the immunogen 
(subcutaneous, intramuscular, intradermal, intravenous and intraperitoneal). The 
production of polyclonal antibodies is monitored by sampling blood of the immunized 
animal at various points following immunization. When a desired level of 
immunogenicity is obtained, the immunized animal can be bled and the serum 
isolated and stored. 

In another aspect, the present invention contemplates a process of producing 
an antibody immunoreactive with a Streptococcus pneumoniae polypeptide 
comprising the steps of (a) transfecting recombinant host cells with a polynucleotide 
that encodes a Streptococcus pneumoniae polypeptide; (b) culturing the host cells 
under conditions sufficient for expression of the polypeptide; (c) recovering the 
polypeptides; and (d) preparing the antibodies to the polypeptides. Preferably, the 
host cell is transfected with the polynucleotide of one of SEQ ID NO:1 through SEQ 
ID NO:215 or SEQ ID NO: 431 through SEQ ID NO: 591. Even more preferably, the 
present invention provides antibodies prepared according to the process described 
above. 

A monoclonal antibody of the present invention can be readily prepared 
through use of well-known techniques such as those exemplified in U.S. Patent 
4,196,265, hereinafter incorporated by reference. Typically, a technique involves first 
immunizing a suitable animal with a selected antigen (e.g., a polypeptide or 
polynucleotide of the present invention) in a manner sufficient to provide an immune 
response. Rodents, such as mice and rats, are preferred animals. Spleen cells from 
the immunized animal are then fused with cells of an immortal myeloma cell line. Where 
the immunized animal is a mouse, a preferred myeloma cell is a murine NS-1 
myeloma cell. 

The fused spleen/myeloma cells are cultured in a selective medium to select 
fused spleen/myeloma cells from the parental cells. Fused cells are separated from 
the mixture of non-fused parental cells, e.g., by the addition of agents that block the 
de novo synthesis of nucleotides in the tissue culture media. Exemplary and 
preferred agents are aminopterin, methotrexate, and azaserine. Aminopterin and 
methotrexate block de novo synthesis of both purines and pyrimidines, whereas 
azaserine blocks only purine synthesis. Where aminopterin or methotrexate is used, 
the media is supplemented with hypoxanthine and thymidine as a source of 
nucleotides. Where azaserine is used, the media is supplemented with 
hypoxanthine. 
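
The relationship between each selective agent, the de novo pathway it blocks, and the supplement the medium then requires can be written out as a lookup. This is a minimal sketch restating the paragraph above; the data structure and function name are hypothetical.

```python
# Illustrative sketch: which supplements accompany which selective agent,
# as stated in the text. Names are hypothetical.

BLOCKED_DE_NOVO_PATHWAYS = {
    "aminopterin": {"purine", "pyrimidine"},
    "methotrexate": {"purine", "pyrimidine"},
    "azaserine": {"purine"},
}


def required_supplements(agent):
    """Return the salvage-pathway substrates added to the medium for an agent."""
    blocked = BLOCKED_DE_NOVO_PATHWAYS[agent]
    supplements = []
    if "purine" in blocked:
        supplements.append("hypoxanthine")  # purine salvage substrate
    if "pyrimidine" in blocked:
        supplements.append("thymidine")  # pyrimidine salvage substrate
    return supplements
```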

This culturing provides a population of hybridomas from which specific 
hybridomas are selected. Typically, selection of hybridomas is performed by 
culturing the cells by single-clone dilution in microtiter plates, followed by testing the 
individual clonal supernatants for reactivity with an antigen-polypeptide. The 
selected clones can then be propagated indefinitely to provide the monoclonal 
antibody. 

By way of specific example, to produce an antibody of the present invention, 
mice are injected intraperitoneally with between about 1 and about 200 μg of an antigen 
comprising a polypeptide of the present invention. B lymphocyte cells are stimulated 
to grow by injecting the antigen in association with an adjuvant such as complete 
Freund's adjuvant (a non-specific stimulator of the immune response containing killed 
Mycobacterium tuberculosis). At some time (e.g., at least two weeks) after the first 
injection, mice are boosted by injection with a second dose of the antigen mixed with 
incomplete Freund's adjuvant. 

A few weeks after the second injection, mice are tail bled and the sera titered 
by immunoprecipitation against radiolabeled antigen. Preferably, the process of 
boosting and titering is repeated until a suitable titer is achieved. The spleen of the 
mouse with the highest titer is removed and the spleen lymphocytes are obtained by 
homogenizing the spleen with a syringe. Typically, a spleen from an immunized 
mouse contains approximately 5 x 10^7 to 2 x 10^8 lymphocytes. 

Mutant lymphocyte cells known as myeloma cells are obtained from 
laboratory animals in which such cells have been induced to grow by a variety of 
well-known methods. Myeloma cells lack the salvage pathway of nucleotide 
biosynthesis. Because myeloma cells are tumor cells, they can be propagated 
indefinitely in tissue culture, and are thus denominated immortal. Numerous cultured 
cell lines of myeloma cells from mice and rats, such as murine NS-1 myeloma cells, 
have been established. 

Myeloma cells are combined under conditions appropriate to foster fusion 
with the normal antibody-producing cells from the spleen of the mouse or rat injected 
with the antigen/polypeptide of the present invention. Fusion conditions include, for 
example, the presence of polyethylene glycol. The resulting fused cells are 
hybridoma cells. Like myeloma cells, hybridoma cells grow indefinitely in culture. 

Hybridoma cells are separated from unfused myeloma cells by culturing in a 
selection medium such as HAT media (hypoxanthine, aminopterin, thymidine). 
Unfused myeloma cells lack the enzymes necessary to synthesize nucleotides from 
the salvage pathway and are therefore killed in the presence of aminopterin, 
methotrexate, or azaserine. Unfused lymphocytes also do not continue to grow in 
tissue culture. Thus, only cells that have successfully fused (hybridoma cells) can 
grow in the selection media. 
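
The selection logic described above reduces to two traits: whether a cell retains the salvage pathway and whether it can be propagated indefinitely. The sketch below encodes that reasoning under the simplifying assumption that a fused cell inherits each trait from whichever parent has it; the class and function names are hypothetical.

```python
# Illustrative sketch of the selection logic for fused and unfused cells.
# The two-trait model and all names are simplifying assumptions.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_salvage_pathway: bool
    immortal: bool


def fuse(a, b):
    """A fused cell is modeled as carrying any trait present in either parent."""
    return Cell(
        name=f"{a.name} x {b.name}",
        has_salvage_pathway=a.has_salvage_pathway or b.has_salvage_pathway,
        immortal=a.immortal or b.immortal,
    )


def grows_in_selection_medium(cell):
    """With de novo synthesis blocked, continued growth needs both traits."""
    return cell.has_salvage_pathway and cell.immortal


MYELOMA = Cell("myeloma", has_salvage_pathway=False, immortal=True)
LYMPHOCYTE = Cell("spleen lymphocyte", has_salvage_pathway=True, immortal=False)
HYBRIDOMA = fuse(MYELOMA, LYMPHOCYTE)
```

Under this model only `HYBRIDOMA` satisfies `grows_in_selection_medium`, matching the statement that only successfully fused cells grow in the selection media.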

Each of the surviving hybridoma cells produces a single antibody. These 
cells are then screened for the production of the specific antibody immunoreactive 
with an antigen/polypeptide of the present invention. Single cell hybridomas are 
isolated by limiting dilutions of the hybridomas. The hybridomas are serially diluted 
many times and, after the dilutions are allowed to grow, the supernatant is tested for 
the presence of the monoclonal antibody. The clones producing that antibody are 
then cultured in large amounts to produce an antibody of the present invention in 
convenient quantity. 
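
Why repeated dilution favors single-cell clones can be illustrated numerically. The sketch below assumes, as a modeling choice not stated in the text, that the number of cells seeded into each well follows a Poisson distribution; the function names are hypothetical.

```python
# Illustrative sketch: chance that a well showing growth started from one cell,
# assuming Poisson-distributed seeding (an assumption, not from the text).
import math


def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well for a given mean cells per well."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def clonal_fraction_of_growing_wells(mean_cells_per_well):
    """Fraction of non-empty wells that received exactly one cell."""
    if mean_cells_per_well <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    p_empty = poisson_pmf(0, mean_cells_per_well)
    p_single = poisson_pmf(1, mean_cells_per_well)
    return p_single / (1.0 - p_empty)
```

Under this assumption, diluting to a lower mean number of cells per well raises the fraction of growing wells that are clonal, at the cost of more empty wells.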

By use of a monoclonal antibody of the present invention, specific 
polypeptides and polynucleotides of the invention are identified as antigens. Once 
identified, those polypeptides and polynucleotides are isolated and purified by 
techniques such as antibody-affinity chromatography. In antibody-affinity 
chromatography, a monoclonal antibody is bound to a solid substrate and exposed to 
a solution containing the desired antigen. The antigen is removed from the solution 
through an immunospecific reaction with the bound antibody. The polypeptide or 
polynucleotide is then easily removed from the substrate and purified. 

Additionally, examples of methods and reagents particularly amenable for use 
in generating and screening an antibody display library can be found in, for example, 
U.S. Patent 5,223,409; International Application WO 92/18619; International 
Application WO 91/17271; International Application WO 92/20791; International 
Application WO 92/15679; International Application WO 93/01288; International 
Application WO 92/01047; International Application WO 92/09690; International 
Application WO 90/02809. 

Additionally, recombinant anti-Streptococcus pneumoniae antibodies, such as 
chimeric and humanized monoclonal antibodies, comprising both human and 
non-human fragments, which can be made using standard recombinant DNA techniques, 
are within the scope of the invention. Such chimeric and humanized monoclonal 
antibodies can be produced by recombinant DNA techniques known in the art, for 
example using methods described in International Application PCT/US86/02269; 
International Application EP 184,187; International Application EP 171,496; 
International Application EP 173,494; International Application WO 86/01533; U.S. 
Patent 4,816,567; and International Application EP 125,023. 

An anti-Streptococcus pneumoniae antibody (e.g., monoclonal antibody) is 
used to isolate Streptococcus pneumoniae polypeptides by standard techniques, 
such as affinity chromatography or immunoprecipitation. An anti-Streptococcus 
pneumoniae antibody facilitates the purification of a natural Streptococcus 
pneumoniae polypeptide from cells and of recombinantly produced Streptococcus 
pneumoniae polypeptides expressed in host cells. Moreover, an anti-Streptococcus 
pneumoniae antibody is used to detect Streptococcus pneumoniae polypeptide (e.g., 
in a cellular lysate or cell supernatant) in order to evaluate the abundance of the 
Streptococcus pneumoniae polypeptide. The detection of circulating fragments of a 
Streptococcus pneumoniae polypeptide is used to identify Streptococcus 
pneumoniae polypeptide turnover in a subject. Anti-Streptococcus pneumoniae 
antibodies are used diagnostically to monitor protein levels in tissue as part of a 
clinical testing procedure, e.g., to determine the efficacy of a given 
treatment regimen. Detection is facilitated by coupling (i.e., physically linking) the 
antibody to a detectable substance. Examples of detectable substances include 
various enzymes, prosthetic groups, fluorescent materials, luminescent materials, 
bioluminescent materials, and radioactive materials. Examples of suitable enzymes 
include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or 
acetylcholinesterase; examples of suitable prosthetic group complexes include 
streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials 
include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, 
dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a 
luminescent material is luminol; examples of bioluminescent materials include 
luciferase, luciferin, and aequorin; and examples of suitable radioactive material 
include 125I, 131I, 35S or 3H. 

G. PHARMACEUTICAL AND IMMUNOGENIC COMPOSITIONS 

In certain embodiments, the present invention provides pharmaceutical and 
immunogenic compositions comprising Streptococcus pneumoniae polypeptides and 
physiologically acceptable carriers. More preferably, the pharmaceutical 
compositions comprise one or more Streptococcus pneumoniae polypeptides 
comprising the amino acid residue sequence of one or more of SEQ ID NO:216 
through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752. In other 
embodiments, the pharmaceutical compositions of the invention comprise 
polynucleotides that encode Streptococcus pneumoniae polypeptides, and 
physiologically acceptable carriers. Preferably, the pharmaceutical and immunogenic 
compositions of the present invention comprise Streptococcus pneumoniae 
polypeptides comprising the amino acid sequence of one of SEQ ID NO:216 through 
SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752. Alternatively, the 
pharmaceutical and immunogenic compositions comprise polynucleotides comprising 
the nucleotide sequence of one of SEQ ID NO:1 through SEQ ID NO:215 or SEQ ID 
NO:431 through SEQ ID NO:591. 

Various tests are used to assess the in vitro immunogenicity of the 
polypeptides of the invention. For example, an in vitro opsonic assay is conducted 
by incubating together a mixture of Streptococcus pneumoniae cells, heat-inactivated 
human serum containing specific antibodies to the polypeptide in question, and an 
exogenous complement source. Opsonophagocytosis proceeds during incubation of 
freshly isolated human polymorphonuclear cells (PMN's) and the 
antibody/complement/pneumococcal cell mixture. Bacterial cells that are coated with 
antibody and complement are killed upon opsonophagocytosis. Colony forming units 
(cfu) of surviving bacteria that escape from opsonophagocytosis are determined by 
plating the assay mixture. Titers are reported as the reciprocal of the highest dilution 
that gives > 50% bacterial killing, as determined by comparison to assay controls. 
Specimens which demonstrate less than 50% killing at the lowest serum dilution 
tested (1:8) are reported as having an OPA titer of 4. The highest dilution tested is 
1:2560. Samples with > 50% killing at the highest dilution are repeated, beginning 
with a higher initial dilution. The method described above is a modification of Gray's 
method (Gray, 1990). 
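
The titer-reporting rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the assay software: the function name `opa_titer`, the dictionary layout and the example killing percentages are hypothetical, and treating exactly 50% killing as meeting the cutoff is an assumption, since the text does not say how that boundary case is handled.

```python
from typing import Mapping, Tuple


def opa_titer(killing_by_dilution: Mapping[int, float],
              cutoff: float = 50.0,
              below_range_titer: int = 4) -> Tuple[int, bool]:
    """Return (titer, needs_repeat) for one serum.

    killing_by_dilution maps the reciprocal serum dilution (8 for 1:8)
    to percent bacterial killing relative to the assay controls.
    """
    if not killing_by_dilution:
        raise ValueError("at least one dilution is required")
    dilutions = sorted(killing_by_dilution)
    lowest, highest = dilutions[0], dilutions[-1]
    # Less than 50% killing at the lowest dilution tested: report a titer of 4.
    if killing_by_dilution[lowest] < cutoff:
        return below_range_titer, False
    # Otherwise report the reciprocal of the highest dilution reaching the cutoff.
    titer = max(d for d in dilutions if killing_by_dilution[d] >= cutoff)
    # Cutoff still met at the highest dilution tested: the sample is repeated
    # starting from a higher initial dilution.
    return titer, titer == highest


# Hypothetical dilution series (values invented for illustration).
example = {8: 96.0, 16: 91.0, 32: 78.0, 64: 57.0, 128: 31.0, 256: 12.0}
titer, needs_repeat = opa_titer(example)
print(titer, needs_repeat)
```

The second element of the returned pair flags sera that still meet the cutoff at the highest dilution supplied, which the text says are repeated from a higher starting dilution.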

A test serum control, which contains test serum plus bacterial cells and 
heat-inactivated complement, is included for each individual serum. This control can be 
used to assess whether antibiotics or other serum components are 
capable of killing the bacterial strain directly (i.e., in the absence of complement or 
PMN's). A human serum with known opsonic titer is used as a positive human serum 
control. The opsonic antibody titer for each unknown serum can be calculated as the 
reciprocal of the initial dilution of serum giving 50% cfu reduction compared to the 
control without serum. 
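
As a rough sketch of the arithmetic behind these controls, the percent cfu reduction of a test well can be computed against the control without serum, and the test serum control can be screened for direct killing. The function names and the 20% tolerance used to flag direct killing are assumptions made for illustration; the text does not specify a numerical criterion.

```python
def percent_cfu_reduction(test_cfu: float, control_cfu: float) -> float:
    """Percent reduction in cfu relative to the control without serum."""
    if control_cfu <= 0:
        raise ValueError("control cfu must be positive")
    return 100.0 * (control_cfu - test_cfu) / control_cfu


def serum_kills_directly(serum_control_cfu: float,
                         no_serum_cfu: float,
                         tolerance_pct: float = 20.0) -> bool:
    """Flag a serum whose control well (heat-inactivated complement, no PMN's)
    already shows a cfu reduction above `tolerance_pct` (assumed threshold)."""
    return percent_cfu_reduction(serum_control_cfu, no_serum_cfu) > tolerance_pct


# Hypothetical counts: 120 cfu without serum, 30 cfu in the test well.
print(percent_cfu_reduction(30, 120))
print(serum_kills_directly(118, 120))
```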

A whole cell ELISA is also used to assess in vitro immunogenicity and 
surface exposure of the polypeptide antigen, wherein the bacterial strain of interest 
(S. pneumoniae) is coated onto a plate, such as a 96 well plate, and test sera from 
an immunized animal are reacted with the bacterial cells. If any antibody specific for 
the test polypeptide antigen is reactive with a surface exposed epitope of the 
polypeptide antigen, it can be detected by standard methods known to one skilled in 
the art. 

Any polypeptide demonstrating the desired in vitro activity is then tested in an 
in vivo animal challenge model. In certain embodiments, immunogenic compositions 
are used in the immunization of an animal (e.g., a mouse) by methods and routes of 
immunization known to those of skill in the art (e.g., intranasal, parenteral, oral, 
rectal, vaginal, transdermal, intraperitoneal, intravenous, subcutaneous, etc.). 
Following immunization of the animal with a particular Streptococcus pneumoniae 
immunogenic composition, the animal is challenged with Streptococcus pneumoniae 
and assayed for resistance to Streptococcus pneumoniae infection. 

In one embodiment, six-week old, pathogen-free, BALB/c mice are immunized 
and challenged with Streptococcus pneumoniae. For example, BALB/c mice, at 10 
animals per group, are immunized (by slow instillation into the nostrils of each 
mouse) with one or more doses of the desired polypeptide in an immunogenic 
composition. Streptococcus pneumoniae colonizes the nasopharynx of BALB/c mice, 
but does not cause disease or death. Subsequently, the BALB/c mice are challenged 
with streptomycin-resistant Streptococcus pneumoniae. The BALB/c mice are 
sacrificed post-challenge, and the noses are removed and homogenized in sterile saline. 
The homogenate is diluted in saline and plated on streptomycin-containing TSA 
plates. Plates are incubated overnight at 37°C and then colonies are counted. 
Statistically significant reduction of nasopharyngeal colonization indicates that the 
polypeptide is suitable for use in human clinical trials. 
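
The path from plate counts to a statement about statistical significance can be sketched as follows. The text does not name a statistical test, so the exact one-sided permutation test below is an assumed choice, and the function names, the plated and homogenate volumes, the floor applied to zero counts and the example values are all hypothetical.

```python
from itertools import combinations
from math import log10
from statistics import mean
from typing import Sequence


def cfu_per_nose(colonies: int, dilution_factor: float,
                 plated_ml: float, homogenate_ml: float) -> float:
    """Back-calculate total cfu in the homogenate from one countable plate."""
    return colonies * dilution_factor / plated_ml * homogenate_ml


def log10_cfu(cfu: float, floor: float = 1.0) -> float:
    """log10 transform; counts below `floor` are set to `floor` (assumption)."""
    return log10(max(cfu, floor))


def exact_permutation_p(control: Sequence[float],
                        immunized: Sequence[float]) -> float:
    """One-sided exact permutation p-value for mean(control) > mean(immunized)."""
    pooled = list(control) + list(immunized)
    n_ctrl, n_imm = len(control), len(immunized)
    total = sum(pooled)
    observed = mean(control) - mean(immunized)
    hits = splits = 0
    # Enumerate every way of assigning n_ctrl animals to the control group.
    for idx in combinations(range(len(pooled)), n_ctrl):
        ctrl_sum = sum(pooled[i] for i in idx)
        diff = ctrl_sum / n_ctrl - (total - ctrl_sum) / n_imm
        splits += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / splits


# Hypothetical log10 cfu per nose, five mice per group to keep the example small.
control = [5.1, 5.4, 4.9, 5.6, 5.2]
immunized = [3.0, 3.4, 2.8, 3.9, 3.1]
p = exact_permutation_p(control, immunized)
print(f"one-sided p = {p:.4f}")
```

With the 10 animals per group described above, the enumeration covers 184,756 group assignments, which remains tractable in plain Python.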

In another embodiment, six-week old, pathogen-free, male CBA/CaHN xid/J 
(CBA/N) mice are immunized intranasally or parenterally prior to Streptococcus 
pneumoniae challenge. CBA/N mice, at 10 animals per group, are immunized with 
an appropriate amount of the desired polypeptide in an immunogenic composition to 
be tested. CBA/N mice are immunodeficient (XID) and, when challenged with 
an appropriate Streptococcus pneumoniae strain, develop nasopharyngeal colonization, 
bacteremia and death. 

The CBA/N mice are immunized intranasally or subcutaneously with one or 
more doses of the desired immunogenic composition. Subsequently, the CBA/N 
mice are challenged with streptomycin-resistant Streptococcus pneumoniae. To 
determine the effects of immunization on intranasal colonization, the CBA/N mice are 
sacrificed post-challenge, and the noses are removed and homogenized in sterile saline. 
The homogenate is serially diluted in saline and plated on streptomycin-containing 
TSA plates. In addition, blood collected post-challenge from each mouse is also 
plated on streptomycin-containing TSA plates to determine levels of bacteremia. 
Plates are incubated overnight at 37°C and then colonies are counted. In another 
embodiment, CBA/N mice are immunized as described above and challenged 
intranasally. The CBA/N mice are observed daily after challenge, and the mortality is 
monitored for 14 days. Statistically significant reduction of nasopharyngeal 
colonization and/or mortality indicates that the polypeptide is suitable for use in 
human clinical trials. 

The Streptococcus pneumoniae polynucleotides, polypeptides, modulators of 
Streptococcus pneumoniae polypeptides, and anti-Streptococcus pneumoniae 
antibodies (also referred to hereinafter as "active compounds") of the invention are 
incorporated into pharmaceutical and immunogenic compositions suitable for 
administration to a subject, e.g., a human. Such compositions typically comprise the 
nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically 
acceptable carrier. As used hereinafter the language "pharmaceutically acceptable 
carrier" is intended to include any and all solvents, dispersion media, coatings, 
antibacterial and antifungal agents, isotonic and absorption delaying agents, and the 
like, compatible with pharmaceutical administration. The use of such media and 
agents for pharmaceutically active substances is well known in the art. Except 
insofar as any conventional media or agent is incompatible with the active 
compound, such media can be used in the compositions of the invention. 
Supplementary active compounds can also be incorporated into the compositions. 

A pharmaceutical or immunogenic composition of the invention is formulated 
to be compatible with its intended route of administration. Examples of routes of 
administration include parenteral (e.g., intravenous, intradermal, subcutaneous, 
intraperitoneal), transmucosal (e.g., oral, rectal, intranasal, vaginal, respiratory) and 
transdermal (topical). Solutions or suspensions used for parenteral, intradermal, or 
subcutaneous application can include the following components: a sterile diluent 
such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, 
propylene glycol or other synthetic solvents; antibacterial agents such as benzyl 
alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; 
chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, 
citrates or phosphates; and agents for the adjustment of tonicity such as sodium 
chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric 
acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, 
disposable syringes or multiple dose vials made of glass or plastic. 

Pharmaceutical compositions suitable for injectable use include sterile 
aqueous solutions (where water soluble) or dispersions and sterile powders for the 
extemporaneous preparation of sterile injectable solutions or dispersions. For 
intravenous administration, suitable carriers include physiological saline, 
bacteriostatic water, Cremophor EL™ (BASF, Parsippany, NJ) or phosphate 
buffered saline (PBS). In all cases, the composition must be sterile and should be 
fluid to the extent that easy syringability exists. It must be stable under the conditions 
of manufacture and storage and must be preserved against the contaminating action 
of microorganisms such as bacteria and fungi. The carrier can be a solvent or 
dispersion medium containing, for example, water, ethanol, polyol (for example, 
glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable 
mixtures thereof. The proper fluidity can be maintained, for example, by the use of a 
coating such as lecithin, by the maintenance of the required particle size in the case 
of dispersion and by the use of surfactants. Prevention of the action of 
microorganisms can be achieved by various antibacterial and antifungal agents, for 
example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In 
many cases, it will be preferable to include isotonic agents, for example, sugars, 
polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. 
Prolonged absorption of the injectable compositions can be brought about by 
including in the composition an agent which delays absorption, for example, 
aluminum monostearate and gelatin. 

Sterile injectable solutions can be prepared by incorporating the active 
compound (e.g., a Streptococcus pneumoniae polypeptide or anti-Streptococcus 
pneumoniae antibody) in the required amount in an appropriate solvent with one or a 
combination of ingredients enumerated above, as required, followed by filtered 
sterilization. Generally, dispersions are prepared by incorporating the active 
compound into a sterile vehicle which contains a basic dispersion medium and the 
required other ingredients from those enumerated above. In the case of sterile 
powders for the preparation of sterile injectable solutions, the preferred methods of 
preparation are vacuum drying and freeze-drying, which yield a powder of the active 
ingredient plus any additional desired ingredient from a previously sterile-filtered 
solution thereof. 

Oral compositions generally include an inert diluent or an edible carrier. They 
can be enclosed in gelatin capsules or compressed into tablets. For the purpose of 
oral therapeutic administration, the active compound can be incorporated with 
excipients and used in the form of tablets, troches, or capsules. Oral compositions 
can also be prepared using a fluid carrier for use as a mouthwash, wherein the 
compound in the fluid carrier is applied orally and swished and expectorated or 
swallowed. Pharmaceutically compatible binding agents and/or adjuvant materials 
can be included as part of the composition. The tablets, pills, capsules, troches and 
the like can contain any of the following ingredients, or compounds of a similar 
nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an 
excipient such as starch or lactose; a disintegrating agent such as alginic acid, 
Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a 
glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or 
saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange 
flavoring. 

For administration by inhalation, the compounds are delivered in the form of 
an aerosol spray from a pressurized container or dispenser which contains a suitable 
propellant, e.g., a gas such as carbon dioxide, or a nebulizer. Systemic 
administration can also be by transmucosal or transdermal means. For transmucosal 
or transdermal administration, penetrants appropriate to the barrier to be permeated 
are used in the formulation. Such penetrants are generally known in the art, and 
include, for example, for transmucosal administration, detergents, bile salts, and 
fusidic acid derivatives. Transmucosal administration can be accomplished through 
the use of nasal sprays or suppositories. For transdermal administration, the active 
compounds are formulated into ointments, salves, gels, or creams as generally 
known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with 
conventional suppository bases such as cocoa butter and other glycerides) or 
retention enemas for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will 
protect the compound against rapid elimination from the body, such as a controlled 
release formulation, including implants and microencapsulated delivery systems. 
Biodegradable, biocompatible polymers can be used, such as ethylene vinyl 
acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic 
acid. Methods for preparation of such formulations will be apparent to those skilled in 
the art. The materials can also be obtained commercially from Alza Corporation and 
Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to 
infected cells with monoclonal antibodies to viral antigens) can also be used as 
pharmaceutically acceptable carriers. These can be prepared according to methods 
known to those skilled in the art, for example, as described in U.S. Patent 4,522,811, 
which is incorporated hereinafter by reference. 

It is especially advantageous to formulate oral or parenteral compositions in 
dosage unit form for ease of administration and uniformity of dosage. Dosage unit 
form as used hereinafter refers to physically discrete units suited as unitary dosages 
for the subject to be treated; each unit containing a predetermined quantity of active 
compound calculated to produce the desired therapeutic effect in association with the 
required pharmaceutical carrier. The specification for the dosage unit forms of the 
invention is dictated by and directly dependent on the unique characteristics of the 
active compound and the particular therapeutic effect to be achieved, and the 
limitations inherent in the art of compounding such an active compound for the 
treatment of individuals. 

Combination immunogenic compositions are provided by including two or 
more of the polypeptides of the invention, as well as by combining one or more of the 
polypeptides of the invention with one or more known S. pyogenes polypeptides, 
including, but not limited to, the C5a peptidase, the M proteins, adhesins and the like. 

In other embodiments, combination immunogenic compositions are provided 
by combining one or more of the polypeptides of the invention with one or more 
known S. pneumoniae polysaccharides or polysaccharide-protein conjugates, 
including, but not limited to, the currently available 23-valent pneumococcal capsular 
polysaccharide vaccine and the 7-valent pneumococcal polysaccharide-protein 
conjugate vaccine. 

The nucleic acid molecules of the invention are inserted into a variety of 
vectors and expression systems. A great variety of expression systems are used. 
Such systems include, among others, chromosomal, episomal and virus-derived 
systems, e.g., vectors derived from bacterial plasmids, from attenuated bacteria such as 
Salmonella (U.S. Patent 4,837,151), from bacteriophage, from transposons, from 
yeast episomes, from insertion elements, from yeast chromosomal elements, from 
viruses such as vaccinia and other poxviruses, sindbis, adenovirus, baculoviruses, 
papova viruses, such as SV40, fowl pox viruses, pseudorabies viruses and 
retroviruses, alphaviruses such as Venezuelan equine encephalitis virus (U.S. Patent 
5,643,576), nonsegmented negative-stranded RNA viruses such as vesicular 
stomatitis virus (U.S. Patent 6,168,943), and vectors derived from combinations 
thereof, such as those derived from plasmid and bacteriophage genetic elements, 
such as cosmids and phagemids. The expression systems should include control 
regions that regulate as well as engender expression, such as promoters and other 
regulatory elements (such as a polyadenylation signal). Generally, any system or 
vector suitable to maintain, propagate or express polynucleotides to produce a 
polypeptide in a host may be used. The appropriate nucleotide sequence may be 
inserted into an expression system by any of a variety of well-known and routine 
techniques, such as, for example, those set forth in Sambrook et al., "Molecular 
Cloning: A Laboratory Manual," 2nd ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, NY, 1989. 

A pharmaceutically acceptable vehicle is understood to designate a 
compound or a combination of compounds entering into a pharmaceutical or 
immunogenic composition which does not cause side effects and which makes it 
possible, for example, to facilitate the administration of the active compound, to 
increase its life and/or its efficacy in the body, to increase its solubility in solution or 
alternatively to enhance its preservation. These pharmaceutically acceptable vehicles 
are well known and will be adapted by persons skilled in the art according to the 
nature and the mode of administration of the active compound chosen. 

As defined hereinafter, an "adjuvant" is a substance that serves to enhance 
the immunogenicity of an "antigen" or the immunogenic compositions comprising 
polypeptide antigens having an amino acid sequence chosen from one of SEQ ID 
NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752. 
Thus, adjuvants are often given to boost the immune response and are well known to 
the skilled artisan. Examples of adjuvants contemplated in the present invention 
include, but are not limited to, aluminum salts (alum) such as aluminum phosphate 
and aluminum hydroxide, Mycobacterium tuberculosis, Bordetella pertussis, bacterial 
lipopolysaccharides, aminoalkyl glucosamine phosphate compounds (AGP), or 
derivatives or analogs thereof, which are available from Corixa (Hamilton, MT), and 
which are described in United States Patent Number 6,113,918; one such AGP is 
2-[(R)-3-Tetradecanoyloxytetradecanoylamino]ethyl 2-Deoxy-4-O-phosphono-3-O-[(R)-
3-tetradecanoyoxytetradecanoyl]-2-[(R)-3-tetradecanoyoxytetradecanoylamino]-b-D-
glucopyranoside, which is also known as 529 (formerly known as RC529), which is 
formulated as an aqueous form or as a stable emulsion, MPL™ (3-O-deacylated 
monophosphoryl lipid A) (Corixa) described in U.S. Patent Number 4,912,094, 
synthetic polynucleotides such as oligonucleotides containing a CpG motif (U.S. 
Patent Number 6,207,646), polypeptides, saponins such as Quil A or STIMULON™ 
QS-21 (Antigenics, Framingham, Massachusetts), described in U.S. Patent Number 
5,057,540, a pertussis toxin (PT), or an E. coli heat-labile toxin (LT), particularly 
LT-K63, LT-R72, CT-S109, PT-K9/G129; see, e.g., International Patent Publication Nos. 
WO 93/13302 and WO 92/19265, cholera toxin (either in a wild-type or mutant form, 
e.g., wherein the glutamic acid at amino acid position 29 is replaced by another 
amino acid, preferably a histidine, in accordance with published International Patent 
Application number WO 00/18434). Various cytokines and lymphokines are suitable 
for use as adjuvants. One such adjuvant is granulocyte-macrophage colony 
stimulating factor (GM-CSF), which has a nucleotide sequence as described in U.S. 
Patent Number 5,078,996. A plasmid containing GM-CSF cDNA has been 
transformed into E. coli and has been deposited with the American Type Culture 
Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110-2209, under 
Accession Number 39900. The cytokine Interleukin-12 (IL-12) is another adjuvant 
which is described in U.S. Patent Number 5,723,127. Other cytokines or 
lymphokines have been shown to have immune modulating activity, including, but not 
limited to, the interleukins 1-alpha, 1-beta, 2, 4, 5, 6, 7, 8, 10, 13, 14, 15, 16, 17 and 
18, the interferons-alpha, beta and gamma, granulocyte colony stimulating factor, 
and the tumor necrosis factors alpha and beta, and are suitable for use as adjuvants. 

A composition of the present invention is typically administered parenterally in 
dosage unit formulations containing standard, well-known nontoxic physiologically 
acceptable carriers, adjuvants, and vehicles as desired. The term parenteral as used 
hereinafter includes intravenous, intramuscular, and intraarterial injection, or infusion 
techniques. 

Injectable preparations, for example sterile injectable aqueous or oleaginous 
suspensions, are formulated according to the known art using suitable dispersing or 
wetting agents and suspending agents. The sterile injectable preparation can also be 
a sterile injectable solution or suspension in a nontoxic parenterally acceptable 
diluent or solvent, for example, as a solution in 1,3-butanediol. 

Among the acceptable vehicles and solvents that may be employed are 
water, Ringer's solution, and isotonic sodium chloride solution. In addition, sterile, 
fixed oils are conventionally employed as a solvent or suspending medium. For this 
purpose any bland fixed oil can be employed including synthetic mono- or 
diglycerides. In addition, fatty acids such as oleic acid find use in the preparation of 
injectables. 

Preferred carriers include neutral saline solutions buffered with phosphate, 
lactate, Tris, and the like. Of course, when administering viral vectors, one purifies 
the vector sufficiently to render it essentially free of undesirable contaminants, such 
as defective interfering adenovirus particles or endotoxins and other pyrogens such 
that it does not cause any untoward reactions in the individual receiving the vector 
construct. A preferred means of purifying the vector involves the use of buoyant 
density gradients, such as cesium chloride gradient centrifugation. 

A carrier can also be a liposome. Means for using liposomes as delivery 
vehicles are well known in the art (see, e.g., Gabizon et al., 1990; Ferruti et al., 1986; 
and Ranade, 1989). 

The immunogenic compositions of this invention also comprise a 
polynucleotide sequence of this invention operatively associated with a regulatory 
sequence that controls gene expression. The polynucleotide sequence of interest is 
engineered into an expression vector, such as a plasmid, under the control of 
regulatory elements which will promote expression of the DNA, that is, promoter 
and/or enhancer elements. In a preferred embodiment, the human cytomegalovirus 
immediate-early promoter/enhancer is used (U.S. Patent 5,168,062). The promoter 
may be cell-specific and permit substantial transcription of the polynucleotide only in 
predetermined cells. 

The polynucleotide is introduced directly into the host either as "naked" DNA 
(U.S. Patent 5,580,859) or formulated in compositions with agents which facilitate 
immunization, such as bupivacaine and other local anesthetics (U.S. Patent 
5,593,972) and cationic polyamines (U.S. Patent 6,127,170). 

In this polynucleotide immunization procedure, the polypeptides of the 
invention are expressed on a transient basis in vivo; no genetic material is inserted or 
integrated into the chromosomes of the host. This procedure is to be distinguished 
from gene therapy, where the goal is to insert or integrate the genetic material of 
interest into the chromosome. An assay is used to confirm that the polynucleotides 
administered by immunization do not give rise to a transformed phenotype in the host 
(U.S. Patent 6,168,918). 

H. Uses and Methods of the Invention 

The Streptococcus pneumoniae polynucleotides, polypeptides, polypeptide 
homologues, modulators, adjuvants, and antibodies described in this invention can 
be used in methods of treatment, diagnostic assays (particularly in disease 
identification), drug screening assays and monitoring of effects during clinical trials. 
The isolated polynucleotides of the invention can be used to express Streptococcus 
pneumoniae polypeptides (e.g., via a recombinant expression vector in a host cell or 
in polynucleotide immunization applications) and to detect Streptococcus 
pneumoniae mRNA (e.g., in a biological sample). Moreover, the anti-Streptococcus 
pneumoniae antibodies of the invention can be used to detect and isolate a 
Streptococcus pneumoniae polypeptide, particularly fragments of a Streptococcus 
pneumoniae polypeptide present in a biological sample, and to modulate 
Streptococcus pneumoniae polypeptide activity. 

The invention provides immunogenic compositions comprising polypeptides 
having an amino acid sequence chosen from one of SEQ ID NO:216 through SEQ ID 
NO:430 or SEQ ID NO:592 through SEQ ID NO:752, a biological equivalent thereof 
or a fragment thereof. The immunogenic composition may further comprise a 
pharmaceutically acceptable carrier, as outlined in section G. In certain preferred 
embodiments, the immunogenic composition will comprise one or more adjuvants. 

In another embodiment, the invention provides immunogenic compositions 
comprising a polynucleotide having a nucleotide sequence chosen from one of SEQ 
ID NO:1 through SEQ ID NO:215 or SEQ ID NO:431 through SEQ ID NO:591, 
wherein the polynucleotide is comprised in a recombinant expression vector. 
Preferably the vector is plasmid DNA. Of course, the polynucleotide may further 
comprise heterologous nucleotides, e.g., the polynucleotide is operatively linked to 
one or more gene expression regulatory elements, and the composition may further 
comprise one or more adjuvants. In a preferred embodiment, the immunogenic 
polynucleotide composition directs the expression of a neutralizing epitope of 
Streptococcus pneumoniae. 

Provided also are methods for immunizing a host against Streptococcus 
pneumoniae infection. In a preferred embodiment, the host is human. Thus, a host 
or subject is administered an immunizing amount of an immunogenic composition 
comprising a polypeptide having an amino acid sequence chosen from one of SEQ 
ID NO:216 through SEQ ID NO:430 or SEQ ID NO:592 through SEQ ID NO:752, a 
biological equivalent thereof or a fragment thereof and a pharmaceutically acceptable 
carrier. An immunizing amount of an immunogenic composition can be determined by doing 
a dose response study in which subjects are immunized with gradually increasing 
amounts of the immunogenic composition and the immune response analyzed to 
determine the optimal dosage. Starting points for the study can be inferred from 
immunization data in animal models. The dosage amount can vary depending upon 
specific conditions of the individual. The amount can be determined in routine trials 
by means known to those skilled in the art. 

An immunologically effective amount of the immunogenic composition in an 
appropriate number of doses is administered to the subject to elicit an immune 
response. Immunologically effective amount, as used herein, means the 
administration of that amount to a mammalian host (preferably human), either in a 
single dose or as part of a series of doses, sufficient to at least cause the immune 
system of the individual treated to generate a response that reduces the clinical 
impact of the bacterial infection. Protection may be conferred by a single dose of the 
immunogenic composition or vaccine, or may require the administration of several 
doses, in addition to booster doses at later times to maintain protection. The response 
may range from a minimal decrease in bacterial burden to prevention of the infection. 
Ideally, the treated individual will not exhibit the more serious clinical manifestations 
of the Streptococcus pneumoniae infection. The dosage amount can vary depending 
upon specific conditions of the individual, such as age and weight. This amount can 
be determined in routine trials by means known to those skilled in the art. 

I. Diagnostic Assays 

The invention provides methods for detecting the presence of a 
Streptococcus pneumoniae polypeptide or Streptococcus pneumoniae 
polynucleotide, or fragment thereof, in a biological sample. The method involves 
contacting the biological sample with a compound or an agent capable of detecting a 
Streptococcus pneumoniae polypeptide or mRNA such that the presence of the 
Streptococcus pneumoniae polypeptide/encoding nucleic acid molecule is detected 
in the biological sample. A preferred agent for detecting Streptococcus pneumoniae 
mRNA or DNA is a labeled or labelable oligonucleotide probe capable of hybridizing 
to Streptococcus pneumoniae mRNA or DNA. The nucleic acid probe can be, for 
example, a full-length Streptococcus pneumoniae polynucleotide of one of SEQ ID 
NO:1 through SEQ ID NO:215 or SEQ ID NO:431 through SEQ ID NO:591, a 
complement thereof, or a fragment thereof, such as an oligonucleotide of at least 15, 
30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize 
under stringent conditions to Streptococcus pneumoniae mRNA or DNA. 
Alternatively, the sample can be contacted with an oligonucleotide primer of a 
Streptococcus pneumoniae polynucleotide of one of SEQ ID NO:1 through SEQ ID 
NO:215 or SEQ ID NO:431 through SEQ ID NO:591, a complement thereof, or a 
fragment thereof, in the presence of nucleotides and a polymerase, under conditions 
permitting primer extension. 

A preferred agent for detecting Streptococcus pneumoniae polypeptide is a
labeled or labelable antibody capable of binding to a Streptococcus pneumoniae
polypeptide. Antibodies can be polyclonal, or more preferably, monoclonal. An intact
antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used. The term "labeled
or labelable," with regard to the probe or antibody, is intended to encompass direct
labeling of the probe or antibody by coupling (i.e., physically linking) a detectable
substance to the probe or antibody, as well as indirect labeling of the probe or
antibody by reactivity with another reagent that is directly labeled. Examples of
indirect labeling include detection of a primary antibody using a fluorescently labeled
secondary antibody and end-labeling of a DNA probe with biotin such that it can be
detected with fluorescently labeled streptavidin. The term "biological sample" is
intended to include tissues, cells and biological fluids isolated from a subject, as well
as tissues, cells and fluids present within a subject. That is, the detection method of
the invention can be used to detect Streptococcus pneumoniae mRNA, DNA, or
protein in a biological sample in vitro as well as in vivo. For example, in vitro
techniques for detection of Streptococcus pneumoniae mRNA include Northern
hybridizations and in situ hybridizations. In vitro techniques for detection of
Streptococcus pneumoniae polypeptide include enzyme linked immunosorbent
assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence.
Alternatively, Streptococcus pneumoniae polypeptides can be detected in vivo in a
subject by introducing into the subject a labeled anti-Streptococcus pneumoniae
antibody. For example, the antibody can be labeled with a radioactive marker whose
presence and location in a subject can be detected by standard imaging techniques.

The polynucleotides according to the invention may also be used in analytical
DNA chips, which allow sequencing, the study of mutations and of the expression of
genes, and which are currently of interest given their very small size and their high
capacity in terms of number of analyses.

The principle of the operation of these chips is based on molecular probes,
most often oligonucleotides, which are attached onto a miniaturized surface,
generally of the order of a few square centimeters. During an analysis, a sample
containing fragments of a target nucleic acid to be analysed, for example DNA or
RNA labelled, for example, after amplification, is deposited onto the DNA chip in
which the support has been coated beforehand with probes. Bringing the labelled
target sequences into contact with the probes leads to the formation, through
hybridization, of a duplex according to the rule of pairing defined by J.D. Watson and
F. Crick. After a washing step, analysis of the surface of the chip allows the effective
hybridizations to be located by means of the signals emitted by the labels tagging the
target. A hybridization fingerprint results from this analysis which, by appropriate
computer processing, will make it possible to determine information such as the
presence of specific fragments in the sample, the determination of sequences and
the presence of mutations.

The chip consists of a multitude of molecular probes, precisely organized or
arrayed on a solid support whose surface is miniaturized. It is at the centre of a
system where other elements (imaging system, microcomputer) allow the acquisition
and interpretation of a hybridization fingerprint.

The hybridization supports are provided in the form of flat or porous surfaces
(pierced with wells) composed of various materials. The choice of a support is
determined by its physicochemical properties, or more precisely, by the relationship
between the latter and the conditions under which the support will be placed during
the synthesis or the attachment of the probes or during the use of the chip. It is
therefore necessary, before considering the use of a particular support, to consider
characteristics such as its stability to pH, its physical strength, its reactivity and its
chemical stability as well as its capacity to nonspecifically bind nucleic acids.
Materials such as glass, silicon and polymers are commonly used. Their surface is,
in a first step, called "functionalization", made reactive towards the groups which it is
desired to attach thereon. After the functionalization, so-called spacer molecules are
grafted onto the activated surface. Used as intermediates between the surface and
the probe, these molecules of variable size render unimportant the surface properties
of the supports, which often prove to be problematic for the synthesis or the
attachment of the probes and for the hybridization.

Among the hybridization supports, there may be mentioned glass which is
used, for example, in the method of in situ synthesis of oligonucleotides by
photochemical addressing developed by the company Affymetrix (E.L. Sheldon,
1993), the glass surface being activated by silane. Genosensor Consortium
(P. Merel, 1994) also uses glass slides carrying wells 3 mm apart, this support being
activated with epoxysilane.

The probes according to the invention may be synthesized directly in situ on
the supports of the DNA chips. This in situ synthesis may be carried out by
photochemical addressing (developed by the company Affymax (Amsterdam,
Holland) and exploited industrially by its subsidiary Affymetrix (United States)), or
based on the VLSIPS (very large scale immobilized polymer synthesis) technology
(S.P.A. Fodor et al., 1991), which is based on a method of photochemically directed
combinatory synthesis, the principle of which combines solid-phase chemistry, the
use of photolabile protecting groups and photolithography.

The probes according to the invention may be attached to the DNA chips in
various ways such as electrochemical addressing, automated addressing or the use
of probe printers (T. Livache et al., 1994; G. Yershov et al., 1996; J. Derisi et al.,
1996, and S. Borman, 1996).

The revealing of the hybridization between the probes of the invention, 
deposited or synthesized in situ on the supports of the DNA chips, and the sample to 
be analysed, may be determined, for example, by measurement of fluorescent 
signals, by radioactive counting or by electronic detection. 

The use of fluorescent molecules such as fluorescein constitutes the most
common method of labelling the samples. It allows direct or indirect revealing of the
hybridization and allows the use of various fluorochromes.

Affymetrix currently provides an apparatus or a scanner designed to read its
Gene Chip™ chips. It makes it possible to detect the hybridizations by scanning the
surface of the chip in confocal microscopy (R.J. Lipshutz et al., 1995).

The nucleotide sequences according to the invention may also be used in
DNA chips to carry out the analysis of the expression of the Streptococcus
pneumoniae genes. This analysis of the expression of Streptococcus pneumoniae
genes is based on the use of chips where probes of the invention, chosen for their
specificity to characterize a given gene, are present (D.J. Lockhart et al., 1996;
D.D. Shoemaker et al., 1996). For the methods of analysis of gene expression using
the DNA chips, reference may, for example, be made to the methods described by
D.J. Lockhart et al. (1996) and Sosnowsky et al. (1997) for the synthesis of probes
in situ or for the addressing and the attachment of previously synthesized probes.
The target sequences to be analysed are labelled and in general fragmented into
sequences of about 50 to 100 nucleotides before being hybridized onto the chip.
After washing as described, for example, by D.J. Lockhart et al. (1996) and
application of different electric fields (Sosnowsky et al., 1997), the labelled
compounds are detected and quantified, the hybridizations being carried out at least
in duplicate. Comparative analyses of the signal intensities obtained with respect to
the same probe for different samples and/or for different probes with the same
sample, determine the differential expression of RNA or copy numbers of DNA
derived from the sample.
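
The comparison of signal intensities described above can be sketched as follows. This is only an illustration: the use of a log2 ratio of mean duplicate signals is an assumption, since the text does not specify how the comparison is computed, and the function name and intensity values are hypothetical.

```python
import math
from statistics import mean


def log2_ratio(sample_a_signals, sample_b_signals):
    """Compare one probe across two samples.

    Each argument holds the replicate intensities (at least duplicates)
    measured for the same probe in one sample. The log2 ratio of the
    replicate means is an assumed summary statistic, not the patent's.
    """
    if len(sample_a_signals) < 2 or len(sample_b_signals) < 2:
        raise ValueError("hybridizations are carried out at least in duplicate")
    return math.log2(mean(sample_a_signals) / mean(sample_b_signals))


# Hypothetical intensities for one probe hybridized in duplicate to two samples.
ratio = log2_ratio([400, 400], [100, 100])
```

A positive value indicates a stronger signal in the first sample, a negative value a stronger signal in the second.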

The nucleotide sequences according to the invention may, in addition, be 
used in DNA chips where other nucleotide probes specific for other microorganisms 
are also present, and may allow the carrying out of a serial test allowing rapid 
identification of the presence of a microorganism in a sample. 

Accordingly, the subject of the invention is also the nucleotide sequences
according to the invention, characterized in that they are immobilized on a support of
a DNA chip.

The DNA chips, characterized in that they contain at least one nucleotide 
sequence according to the invention, immobilized on the support of the said chip, 
also form part of the invention.

The chips will preferably contain several probes or nucleotide sequences of 
the invention of different length and/or corresponding to different genes so as to 
identify, with greater certainty, the specificity of the target sequences or the desired
mutation in the sample to be analysed.

Accordingly, the analyses carried out by means of primers and/or probes 
according to the invention, immobilized on supports such as DNA chips, will make it 
possible, for example, to identify, in samples, mutations linked to variations such as 
intraspecies variations. These variations may be correlated or associated with
pathologies specific to the variant identified and will make it possible to select the 
appropriate treatment. 

The invention thus comprises a DNA chip according to the invention,
characterized in that it contains, in addition, at least one nucleotide sequence of a
microorganism different from Streptococcus pneumoniae, immobilized on the support
of the said chip; preferably, the different microorganism will be chosen from an
associated microorganism, a bacterium of the Streptococcus family, and a variant of
the species Streptococcus pneumoniae.

The principle of the DNA chip, as explained above, may also be used to
produce protein "chips" on which the support has been coated with a polypeptide or
an antibody according to the invention, or arrays thereof, in place of the DNA. These
protein "chips" make it possible, for example, to analyse the biomolecular interactions
(BIA) induced by the affinity capture of target analytes onto a support coated, for
example, with proteins, by surface plasmon resonance (SPR). Reference may be
made, for example, to the techniques for coupling proteins onto a solid support which
are described in International Application EP 524 800 or to the methods describing
the use of biosensor-type protein chips such as the BIAcore-type technique
(Pharmacia) (Arlinghaus et al., 1997, Krone et al., 1997, Chatelier et al., 1995).
These polypeptides or antibodies according to the invention, capable of specifically
binding antibodies or polypeptides derived from the sample to be analysed, may thus
be used in protein chips for the detection and/or the identification of proteins in
samples. The said protein chips may in particular be used for infectious diagnosis
and may preferably contain, per chip, several polypeptides and/or antibodies of the
invention of different specificity, and/or polypeptides and/or antibodies capable of
recognizing microorganisms different from Streptococcus pneumoniae.

Accordingly, the subject of the present invention is also the polypeptides and 
the antibodies according to the invention, characterized in that they are immobilized 
on a support, in particular of a protein chip.

The protein chips, characterized in that they contain at least one polypeptide
or one antibody according to the invention immobilized on the support of the said 
chip, also form part of the invention. 

The invention comprises, in addition, a protein chip according to the invention, 
characterized in that it contains, in addition, at least one polypeptide of a
microorganism different from Streptococcus pneumoniae or at least one antibody 
directed against a compound of a microorganism different from Streptococcus 
pneumoniae, immobilized on the support of the chip. 

The invention also relates to a kit or set for the detection and/or the 
identification of bacteria belonging to the species Streptococcus pneumoniae or to an
associated microorganism, or for the detection and/or the identification of a 
microorganism characterized in that it comprises a protein chip according to the 
invention. 

The present invention also provides a method for the detection and/or the
identification of bacteria belonging to the species Streptococcus pneumoniae or to an
associated microorganism in a biological sample, characterized in that it uses a 
nucleotide sequence according to the invention. 

The invention also encompasses kits for detecting the presence of a
Streptococcus pneumoniae polypeptide in a biological sample. For example, the kit
comprises reagents such as a labeled or labelable compound or agent capable of
detecting Streptococcus pneumoniae polypeptide or mRNA in a biological sample;
means for determining the amount of Streptococcus pneumoniae polypeptide in the
sample; and means for comparing the amount of Streptococcus pneumoniae
polypeptide in the sample with a standard. The compound or agent is packaged in a
suitable container. The kit further comprises instructions for using the kit to detect
Streptococcus pneumoniae mRNA or protein.

In certain embodiments, detection involves the use of a probe/primer in a
polymerase chain reaction (PCR) (see, e.g., U.S. Patent 4,683,195 and U.S. Patent
4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain
reaction (LCR). This method includes the steps of collecting a sample of cells from a
patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the
sample, contacting the nucleic acid sample with one or more primers which
specifically hybridize to a Streptococcus pneumoniae polynucleotide under conditions
such that hybridization and amplification of the Streptococcus pneumoniae
polynucleotide (if present) occurs, and detecting the presence or absence of an
amplification product, or detecting the size of the amplification product and
comparing the length to a control sample.

All patents and publications cited herein are hereby incorporated by
reference.

Examples 

The following examples are carried out using standard techniques, which are 
well known and routine to those of skill in the art, except where otherwise described
in detail. The following examples are presented for illustrative purpose, and should 
not be construed in any way limiting the scope of this invention. 

Example 1 

Bioinformatics and Gene Mining of Streptococcus pneumoniae

The genomic sequence of Streptococcus pneumoniae was downloaded from
The Institute for Genomic Research (TIGR) website and novel open reading frames
(ORFs) were determined in the following manner. An ORF was defined as having
one of three potential start site codons, ATG, GTG or TTG and one of three potential
stop codons, TAA, TAG or TGA. The inventors used a unique set of two ORF finder
algorithms: GLIMMER (Salzberg et al., 1998) and inventors' assignee's program to
enhance the efficiency for finding "all" ORFs. In order to evaluate the accuracy of the
ORFs determined, a program developed by inventors' assignee called DiCTion was
employed that uses a discrete mathematical cosine function to assign a score for
each ORF. An ORF with a DiCTion score > 1.5 is considered to have a high
probability of encoding a protein product. The minimum length of an ORF predicted
by the two ORF finding algorithms was set to 225 nucleotides (including stop codon)
which would encode a protein of 74 amino acids. As a final search for remnants of
ORFs, all noncoding regions > 75 nucleotides were searched against the public
protein databases (described below) using tBLASTn. This helped to identify regions
of genes that contain frameshifts (Mejlhede et al., 1999) or fragments of genes that
might have a role in causing antigenic variation (Fraser et al., 1997). A graphical
analysis program developed by inventors' assignee also allowed the inventors to see
all six reading frames and the location of the predicted ORFs relative to the genomic
sequence for further inspection. This helped to eliminate those ORFs that have large
overlaps with other ORFs, although there are known cases of ORFs being totally
embedded within other ORFs (Loessner et al., 1999; Hernandez-Sanchez et al.,
1998).
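
The ORF definition above (three start codons, three stop codons, and a minimum length of 225 nucleotides including the stop codon) can be illustrated with a minimal six-frame scan. This sketch is neither GLIMMER nor the assignee's program, and it does not compute a DiCTion score; the function names are hypothetical, and taking the first in-frame start codon as the ORF start is an assumption.

```python
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_NT = 225  # minimum ORF length, including the stop codon

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs_in_strand(seq, min_len=MIN_ORF_NT):
    """Return (start, end, frame) tuples for one strand.

    Coordinates are 0-based, end is exclusive and includes the stop codon.
    The first start codon seen in a frame opens the ORF (an assumption).
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_len:
                    orfs.append((start, end, frame))
                start = None
    return orfs


def find_orfs(seq, min_len=MIN_ORF_NT):
    """Scan all six reading frames.

    Reverse-strand coordinates refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    forward = [("+", s, e, f) for s, e, f in find_orfs_in_strand(seq, min_len)]
    reverse = [("-", s, e, f)
               for s, e, f in find_orfs_in_strand(reverse_complement(seq), min_len)]
    return forward + reverse
```

An ORF of exactly 225 nucleotides contains 75 codons, one of which is the stop codon, which gives the 74 amino acids stated in the text.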

The initial annotation of the Streptococcus pneumoniae ORFs was performed
using the BLAST (v. 2.0) Gapped search algorithm, Blastp, to identify homologous
sequences (Altschul et al., 1997). A cutoff 'e' value of anything < e^-10 was considered
significant. Other search algorithms such as FASTA or PSI-BLAST were used as
needed. The non-redundant protein sequence database used for the homology
searches consisted of GenBank, SWISS-PROT (Bairoch and Apweiler, 2000), PIR
(Barker et al., 2001), and TREMBL (Bairoch and Apweiler, 2000) database
sequences updated daily. ORFs with a Blastp result of > e^-10 were considered to be
unique to Streptococcus pneumoniae.
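
The cutoff rule above can be sketched as a simple classification of ORFs by their best Blastp e-value. This is illustrative only: reading the cutoff as 1e-10 is an assumption, as is treating an ORF with no hit, or with an e-value exactly at the cutoff, as unique. The function name and ORF identifiers are hypothetical.

```python
E_VALUE_CUTOFF = 1e-10  # assumption: the cutoff in the text read as 1e-10


def classify_orfs(best_evalues, cutoff=E_VALUE_CUTOFF):
    """Label each ORF as 'homolog' or 'unique'.

    best_evalues maps an ORF identifier to its lowest Blastp e-value,
    or to None when the search returned no hit at all.
    """
    labels = {}
    for orf_id, evalue in best_evalues.items():
        if evalue is not None and evalue < cutoff:
            labels[orf_id] = "homolog"  # significant match in the database
        else:
            labels[orf_id] = "unique"   # no significant match found
    return labels


# Hypothetical ORF identifiers and e-values.
labels = classify_orfs({"orf_a": 3e-42, "orf_b": 0.5, "orf_c": None})
```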

A keyword search of the entire BLAST results was carried out using known or
suspected target genes for immunogenic compositions, as well as words that
identified the location of a protein or function.

Several parameters were used to determine grouping of the predicted
proteins. Proteins destined for translocation across the cytoplasmic membrane
encode a leader signal (also called signal sequence) composed of a central
hydrophobic region flanked at the N-terminus by positively charged residues
(Pugsley, 1993). A program, called SignalP, identifies signal peptides and their
cleavage sites (Nielsen et al., 1997). To predict protein localization in bacteria, the
software PSORT has been used (Nakai and Kanehisa, 1991). This program uses a
neural net algorithm to predict localization of proteins to the 'cytoplasm', 'periplasm',
and 'cytoplasmic membrane' for Gram-positive bacteria as well as 'outer membrane'
for Gram-negative bacteria. Transmembrane (TM) domains of proteins have been
analyzed using the software program TopPred II (Cserzo et al., 1997).

The Hidden Markov Model (HMM) Pfam database of multiple alignments of
protein domains or conserved protein regions (Sonnhammer et al., 1997) was used
to identify Streptococcus pneumoniae proteins that may belong to an existing protein
family. Keyword searching of this output was used to help identify additional
candidate ORFs that may have been missed by the BLAST search criteria. A
computer algorithm, called HMM Lipo, was developed by inventors' assignee to
predict lipoproteins using approximately 131 biologically proven bacterial lipoproteins.
This training set was generated from experimentally proven prokaryotic lipoproteins.
The protein sequence from the start of the protein to the cysteine amino acid plus the
next two additional amino acids was used to generate the HMM. Using
approximately 70 known prokaryotic proteins containing the LPXTG cell wall sorting
signal, a HMM (Eddy, 1996) was developed to predict cell wall proteins that are
anchored to the peptidoglycan layer (Mazmanian et al., 1999; Navarre and
Schneewind, 1999). The model used not only the LPXTG sequence but also
included two features of the downstream sequence, first the hydrophobic
transmembrane domain and secondly, the positively charged carboxy terminus.
There are also a number of proteins that interact, non-covalently, with the
peptidoglycan layer and are distinct from the LPXTG protein class described above.
These proteins seem to have a consensus sequence at their carboxy terminus
(Koebnik, 1995). The inventors' assignee has also developed and used a HMM of
this region to identify any Streptococcus pneumoniae proteins that may fall into this
class of proteins.
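
The three features of the LPXTG sorting signal named above (the motif, a downstream hydrophobic domain, and a positively charged carboxy terminus) can be illustrated with a simple pattern check. This is a deliberately simplified substitute for the HMM described in the text, not a reconstruction of it. The hydrophobic residue set, the minimum run length of 10, the 8-residue tail window and the threshold of two K/R residues are all hypothetical values chosen for the sketch.

```python
import re

LPXTG = re.compile(r"LP.TG")                        # X is any residue
HYDROPHOBIC_RUN = re.compile(r"[AILMFVWGC]{10,}")   # assumed residue set and length


def looks_like_lpxtg_anchor(protein, tail_window=8, min_positive=2):
    """Heuristic check for an LPXTG-type cell wall sorting signal.

    Returns True when an LPXTG motif is followed by a hydrophobic stretch
    and then by a carboxy-terminal tail containing K/R residues.
    """
    protein = protein.upper()
    for match in LPXTG.finditer(protein):
        downstream = protein[match.end():]
        run = HYDROPHOBIC_RUN.search(downstream)
        if run is None:
            continue  # no hydrophobic domain after this motif
        tail = downstream[run.end():][-tail_window:]
        positives = sum(tail.count(residue) for residue in "KR")
        if positives >= min_positive:
            return True
    return False
```

A regular expression cannot weigh partial matches the way a trained HMM does, so this check would miss divergent sorting signals that a profile model could still score.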

The proteins encoded by Streptococcus pneumoniae identified ORFs were
also evaluated for other useful characteristics. A tandem repeat finder (Benson,
1999) identified ORFs containing repeated DNA sequences such as those found in
MSCRAMMs (Foster and Hook, 1998) and phase variable surface proteins of
Neisseria meningitidis (Parkhill et al., 2000). Proteins that contain the Arg-Gly-Asp
(RGD) attachment motif, together with integrins that serve as their receptor,
constitute a major recognition system for cell adhesion. RGD recognition is one
mechanism used by microbes to gain entry into eukaryotic tissues (Stockbauer et al.,
1999; Isberg and Tran Van Nhieu, 1994). However, not all RGD containing proteins
mediate cell attachment. It has been shown that RGD containing peptides with a
proline at the carboxy end (RGDP) are inactive in cell attachment assays
(Pierschbacher and Ruoslahti, 1987) and are excluded. The Geanfammer software
was used to cluster proteins into homologous families (Park and Teichmann, 1998).
Preliminary analysis of the family classes has provided novel ORFs within a
candidate cluster as well as defining potential protein function.
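
The RGD screen described above, in which RGD motifs followed by proline (RGDP) are excluded, can be sketched with a negative lookahead. The function name is hypothetical and the text does not state how the screen was actually implemented.

```python
import re

# RGD not immediately followed by proline; RGDP is reported as inactive.
RGD_NOT_RGDP = re.compile(r"RGD(?!P)")


def active_rgd_positions(protein):
    """Return 0-based start positions of RGD motifs that are not RGDP."""
    return [match.start() for match in RGD_NOT_RGDP.finditer(protein.upper())]
```

A protein would be retained as a candidate when this list is non-empty.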

Example 2

Cloning, Expression and Analysis of Predicted ORF Proteins

Materials and Methods

Growth of Streptococcus pneumoniae. Streptococcus pneumoniae were
grown in Todd Hewitt broth (Difco) supplemented with 0.5% yeast extract. Bacteria
were incubated at 35°C in 5% CO2 without shaking. Mid-log phase cultures (OD550
approx 0.3) were harvested after approximately 4 hours incubation and cells pelleted
by centrifugation (5,000 x g) at 4°C.

Cloning and expression of predicted ORFs. The predicted ORFs were cloned
and expressed in E. coli TOP10 or BLR(DE3). Expression of each ORF was tested in
both pBAD/Thio-TOPO (which contains an arabinose inducible promoter) and
pCR-T7/NT-TOPO expression systems (Invitrogen, Carlsbad, CA). Gene specific primers
were designed to amplify, by polymerase chain reaction (PCR), each selected ORF
from Streptococcus pneumoniae CP1200 (Morrison et al., 1983) genomic DNA
purified using the Wizard Genomic DNA purification kit (Promega, Madison, WI). The
5' primers were designed to exclude the predicted signal sequence (as predicted by
SignalP) and the 3' primer was designed to either include the stop codon (pCR-T7) or
exclude the stop codon (pBAD). ORFs were amplified in a standard polymerase
chain reaction (200 µM each dNTP (Invitrogen), 200 µM each 5' and 3' gene specific
primer, 1 µL stock of chromosomal DNA, 2.5 U Pfu Turbo polymerase (Stratagene,
La Jolla, CA) and 1x Pfu Turbo reaction buffer in a total volume of 50 µL).
Overhanging As were added to the PCR products by incubation for 10 minutes at
72°C with 1 U of Taq DNA polymerase (Roche Diagnostics, Indianapolis, IN). PCR
products were cloned into the expression vectors and transformed into E. coli TOP10
following manufacturer's TOPO-TA cloning protocol (Invitrogen). Positive clones
were identified by PCR using one gene specific primer and one vector specific primer
to ensure correct orientation.

ORFs cloned into pCR-T7 were transformed into E. coli BL21(DE3) for protein
expression using the T7 promoter and those cloned into pBAD were kept in TOP10.
Protein expression was determined by growing overnight cultures of the positive
clones in 2 mL HySoy broth (DMV International Nutritional, Fraser, NY)
supplemented with 100 µg/mL ampicillin. These cultures were then diluted 1:100 into
fresh media and grown until OD600 = 1.0. Protein expression was induced with either
2% arabinose (pBAD) or 0.1 mM IPTG (pCRT7). Three hours post-induction, the
cells were harvested and protein expression determined by Western blot analysis of
whole-cell lysates using either anti-express epitope (pCRT7) or anti-thio (pBAD)
antibodies. The best expressing clone (pBAD or pCRT7) was used for protein
production and purification.

Fourteen of the ORFs that did not express in either pCRT7 or pBAD were
cloned into pET27b(+) (Novagen, Madison, WI). The ORFs were again amplified by
PCR and cloned using standard molecular biology techniques into the NcoI and XhoI
sites of pET27b(+). Clones were again screened by PCR, and plasmids with the
correct insert were transformed into BL21(DE3) and expression tested as described
for pCR-T7. Protein expression was determined by Western blot analysis using
anti-HSV epitope antibody.

Purification of Soluble His-tag ORF Proteins. Protein was expressed from
positive clones in 4 x 1 L of media as described above. Cells were harvested by
centrifugation, resuspended in 100 mL of Ni Buffer A (20 mM Tris, pH 7.5, 150 mM
NaCl) and lysed by 2 passages through a French pressure cell at 16,000 psi (SLM
Instruments, Inc., Rochester, NY).

For soluble proteins, the cell debris was pelleted by centrifugation at ~9,000 x
g and the supernatant was loaded onto an iminodiacetic acid sepharose 6B (Sigma
Chemical, St. Louis, MO) column charged with Ni2+. Unbound proteins were washed
from the column with Ni buffer A until A280 of eluate reached a baseline. The bound
protein was then eluted with Ni buffer A containing 300 mM imidazole (Sigma
Chemical). Purity was estimated by SDS-PAGE.

Samples requiring further purification were concentrated and buffer
exchanged over a PD-10 column (Amersham-Pharmacia Biotech, Piscataway, NJ)
equilibrated with buffer A (20 mM Tris, pH 8.0). The eluate was loaded onto a
Q-sepharose High Performance (Amersham-Pharmacia Biotech) column and eluted
with a 0-35% Buffer B (20 mM Tris, pH 8.0, 1 M NaCl) gradient. Protein-containing
fractions were determined by SDS-PAGE. All protein purification was done using an
AKTA Explorer (Amersham-Pharmacia Biotech).

Isolation and Solubilization of Insoluble His-tag fusion proteins. Bacterial cell
pellets were suspended at a ratio of 5:1 (buffer volume:pellet wet weight) in 10 mM
NaPO4/150 mM NaCl/pH 7.0 with Complete Protease Inhibitor Cocktail containing
EDTA (Roche Diagnostics GmbH, Mannheim, Germany). The cells were disrupted
using a Microfluidizer (Microfluidics Corp., Newton, MA) and centrifuged at 21,900 x
g for 30 minutes at 4°C. The pellet, containing insoluble His-tag proteins, was
subjected to a series of detergent extractions followed by a final solubilization step
using 6 M urea. The pellet was resuspended in 10 mM NaPO4/150 mM NaCl/pH 7.0
containing Complete Protease Inhibitor Cocktail and 1.0% Triton X-100 (TX-100)
using the same 5:1 ratio described above. The suspension was stirred at 4°C for 30
minutes and centrifuged at 21,900 x g for 20 minutes at 4°C. The supernatant was
removed and stored at 4°C for further analysis. The pellet was subjected to a second
TX-100 extraction, as described, and the supernatant removed and stored at 4°C for
further analysis. The TX-100 pellet was then resuspended in 10 mM NaPO4/150 mM
NaCl/pH 7.0 containing Complete Protease Inhibitor Cocktail and 1.0% Zwittergent
3-14 (Z3-14) and stirred at 4°C for a minimum of 1 hour. The suspension was
centrifuged at 21,900 x g for 20 minutes at 4°C. The supernatant was removed and
stored at 4°C for further analysis. The Z3-14 pellet was resuspended in 100 mM
Tris-HCl/6 M urea/pH 8.0 and stirred a minimum of 4 hours at room temperature. The
suspension was centrifuged at 21,900 x g for 20 minutes at 4°C and the supernatant
stored at 4°C for further analysis.

Purification of Solubilized His-tag fusion proteins. Isolated extracts containing
His-tag fusion proteins were identified as described by SDS-PAGE and/or Western
blot analysis. Chromatography was carried out using POROS MC 20 micron metal
chelate Ni2+ media (Perseptive Biosystems, Framingham, MA) prepared according to
the manufacturer. Protein extracts were loaded at approximately 5-10 mg of total
protein per mL of column media.

For preparations in which the His-tag proteins were soluble in either the
cytosolic fraction or detergent extractions by TX-100 or Z3-14, the material was
applied directly to a MC 20 column equilibrated with a minimum of 3 column volumes
of 10 mM NaPO4/150 mM NaCl/pH 7.0 for cytosolic proteins, or the same buffer
containing either 1.0% TX-100 or 1.0% Z3-14 for proteins isolated in the TX-100 and
Z3-14 extractions respectively. For cytosolic material, unbound proteins were
washed through the column with a minimum of 5 column volumes of equilibration
buffer. For TX-100 or Z3-14 containing extracts, unbound proteins were washed
through the column with equilibration buffer containing either 0.05% TX-100 or Z3-14,
depending on the solubility characteristics of the particular protein. His-tag fusion
proteins were eluted using a step gradient of 2 column volumes each of 25 mM, 50
mM, 125 mM, and 250 mM imidazole in 10 mM NaPO4/150 mM NaCl/pH 7.0
containing either 0.05% TX-100 or 0.05% Z3-14. Fractions containing His-tag protein
were identified by SDS-PAGE and pooled. Imidazole was removed by dialysis into
an appropriate buffer. Protein concentration was determined by BCA assay (Pierce)
and, if necessary, preparations were concentrated by either ultrafiltration using
Centriprep YM-10 membranes (Millipore, Bedford, MA) or by applying the material to
a smaller MC 20 column, under the conditions described, and eluting with 250 mM
imidazole followed by dialysis. Protein purity was estimated by SDS-PAGE and
scanning densitometry.

For preparations in which urea was used to denature and solubilize the
protein, the material was diluted 3 fold with 100 mM Tris-HCl/0.05% TX-100/pH 7.5 to
give a final urea concentration of 2 M. The material was applied to an MC 20 column
equilibrated with a minimum of 3 column volumes of 100 mM Tris-HCl/0.05% TX-100/2 M
urea/pH 7.5 and unbound proteins were washed through the column with a
minimum of 5 column volumes of equilibration buffer. His-tag fusion proteins were
eluted using a step gradient of 2 column volumes each of 25 mM, 50 mM, 125 mM,
and 250 mM imidazole in 100 mM Tris-HCl/0.05% TX-100/2 M urea/pH 7.5.
Fractions containing His-tag protein were identified by SDS-PAGE and pooled.
Imidazole and urea were removed, and the protein refolded, by dialysis into an
appropriate buffer containing 0.05% TX-100. If necessary, preparations were
concentrated by either ultrafiltration using Centriprep YM-10 membranes (Millipore,
Bedford, MA) or by applying the material to a smaller MC 20 column, under the
conditions described, and eluting with 250 mM imidazole followed by dialysis.
Protein purity was estimated by SDS-PAGE and scanning densitometry.
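
The dilution arithmetic in the urea procedure above can be checked with a short sketch. The 6 M starting concentration it returns is only what the stated 3 fold dilution to 2 M implies arithmetically; it is not quoted from this passage. The 10 mL sample volume in the usage note is hypothetical.

```python
def concentration_before_dilution(final_molar, fold):
    """Concentration implied before an n-fold dilution (C1 = C2 * fold)."""
    return final_molar * fold


def diluent_volume_ml(sample_volume_ml, fold):
    """Volume of diluent to add so the total volume is fold * sample volume."""
    return sample_volume_ml * (fold - 1)


# A 3 fold dilution that ends at 2 M urea implies a 6 M starting solution.
implied_start_molar = concentration_before_dilution(2.0, 3)
```

For example, a hypothetical 10 mL urea extract would receive diluent_volume_ml(10.0, 3) = 20 mL of the Tris-HCl/TX-100 buffer.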

SDS-PAGE & Western Analysis. SDS-PAGE was carried out as described by
Laemmli (Laemmli, 1970), using 10-20% (wt/vol) gradient acrylamide gels (Zaxis,
Hudson, OH). Proteins were visualized by staining the gels with Simply Blue
Safestain (Invitrogen Life Technologies, Carlsbad, CA). The gels were scanned with
a Personal Densitometer SI (Molecular Dynamics Inc., Sunnyvale, CA) and purities
were estimated using the Image Quant software (Molecular Dynamics Inc.).

Transfer of proteins to polyvinylidene difluoride (PVDF) membranes was
accomplished with a semidry electroblotter and electroblot buffers (Owl Separation
Systems, Portsmouth, NH). The PVDF membrane, containing the transferred
protein, was blocked with 5% non-fat dry milk prepared in PBS (Blotto) for 30
minutes. The membrane was then probed with one of the following primary antibody
preparations, at the indicated dilution, specific for the individual protein expression
system: Invitrogen anti-Xpress (1:5000), Invitrogen anti-thioredoxin (1:2000),
Novagen anti-HSV epitope (1:5000), Qiagen anti-4X His (1:5000). The membrane
was then washed with Blotto and incubated with goat anti-mouse alkaline phosphatase
conjugate (1:1500) as the secondary antibody (Biosource International, Camarillo,
CA). Western blots were developed with the 5-bromo-4-chloro-indolylphosphate-
nitroblue tetrazolium (BCIP/NBT) phosphatase substrate system (Kirkegaard and
Perry Laboratories, Gaithersburg, MD).

Protein quantitation. Protein concentrations were estimated by the
bicinchoninic acid (BCA) assay (Pierce, Rockford, IL) with bovine serum albumin as the
standard.

Production of anti-ORF sera in mice. Female Swiss Webster mice (Taconic
Farms, Germantown, NY), 6 to 8 weeks old, were immunized
subcutaneously in the neck at weeks 0, 4, and 6 with purified His-tag protein.
Two separate immunogenic compositions were prepared with each His-tag protein.
One immunogenic composition was prepared with the protein formulated with
STIMULON™ QS-21 and a second was prepared with the protein formulated with
MPL™. Each dose for one group of mice contained 10 µg of purified protein and 20
µg STIMULON™ QS-21, while each dose for the second group of mice contained 10
µg of the same protein and 50 µg MPL™. Serum samples were collected at weeks 0,
4, 6 and 8. Mice were housed in a specific-pathogen free facility and provided water
and food ad libitum.

Pneumococcal whole-cell ELISAs. Streptococcus pneumoniae strains, either
type 3 or type 14, were grown in Todd Hewitt broth (Difco) containing 100 µg/ml
streptomycin at 35°C without shaking. The bacteria were grown to mid-log phase
(OD550 < 1.0), and heat inactivated for 1 hour at 60°C. Bacteria were pelleted at
10,000 x g and resuspended in PBS to an OD550 = 0.1. Fifty-five µl of this
suspension was then added to each well of 96-well Nunc plates and air dried at room
temperature. Plates were stored at 4°C until used.

Wells were blocked with 150 µl/well of PBS containing 5% (wt/vol) dry milk
(blocking buffer) for 1 hour. Wells were washed 5 times with PBS in a Skantron
washer, and mouse sera diluted in blocking buffer (100 µl/well) added. Plates were
incubated at room temperature for 2 hours and unbound antibodies removed by
washing 5 times with PBS in a Skantron washer. Bound antibodies were detected
with 100 µl/well of peroxidase-labeled goat anti-mouse IgG (1:1,000 dilution of
1 mg/ml in PBS; KPL) at room temperature for 2 hours. Plates were washed with PBS
as above, and developed with 100 µl/well ABTS (KPL) for 25 minutes at room
temperature. The reactions were stopped with 100 µl/well of 1% SDS and the OD405
of each well read on a VERSAmax microplate reader (Molecular Devices Corp.,
Sunnyvale, Calif.). Endpoint titers of each test serum were calculated as the inverse
of the highest mean dilution giving an OD405 = 0.1.
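
One way to compute an endpoint titer of the kind defined above is sketched below. The specification does not say how the dilution giving OD405 = 0.1 was located between tested dilutions; log-linear interpolation between the two bracketing dilutions is an assumption made here, and the dilution series and OD values in the usage note are hypothetical.

```python
import math

CUTOFF_OD = 0.1  # OD405 cutoff stated in the text


def endpoint_titer(reciprocal_dilutions, mean_ods, cutoff=CUTOFF_OD):
    """Estimate the reciprocal dilution at which the mean OD reaches cutoff.

    reciprocal_dilutions must be ascending (for example 200, 400, 800).
    Returns None when even the first dilution reads below the cutoff, and
    the last reciprocal dilution when the OD never drops below the cutoff.
    Interpolation on a log10 dilution scale is an assumption, not the
    documented method.
    """
    points = list(zip(reciprocal_dilutions, mean_ods))
    if points[0][1] < cutoff:
        return None
    for (d_near, od_near), (d_far, od_far) in zip(points, points[1:]):
        if od_near >= cutoff > od_far:
            frac = (od_near - cutoff) / (od_near - od_far)
            log_d = math.log10(d_near) + frac * (
                math.log10(d_far) - math.log10(d_near)
            )
            return 10 ** log_d
    return reciprocal_dilutions[-1]
```

With a hypothetical series of 1:200 to 1:1600 and mean ODs of 0.8, 0.4, 0.1 and 0.05, the function returns approximately 800. A serum that reads below 0.1 at the first dilution returns None, which would correspond to a "less than" entry in a results table.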

FACS analysis of Streptococcus pneumoniae. Strains type 3 and 19F were
grown in Todd-Hewitt broth + 0.5% yeast extract from frozen stocks of OD600 ~1.0
cells. Incubation was at 37°C for 3 to 4 hours without shaking. 2-3 x 10^7 cells (100 µl
at OD600 = 0.5 for type 3, and 50 µl for 19F) were pipetted into a 96-well microtiter plate
and spun at 4000 rpm in an Eppendorf tabletop centrifuge for 5 minutes.
Supernatant was aspirated and cells were resuspended in 95 µl PBS-0.5% BSA-0.1%
gelatin. Five µl primary antibody was added, mixed and left incubating on ice for 1
hour. Cells were pelleted as before, washed twice with 100 µl buffer and
resuspended in 99 µl buffer. One µl goat anti-mouse secondary antibody conjugated
to Alexa Fluor 488 (Molecular Probes, Eugene, OR) was added to the samples,
mixed and left incubating on ice for 30 minutes. Cells were washed as before and
resuspended in 100 µl buffer. Before analyzing on the FACSVantageSE unit,
samples were diluted to 1 ml with buffer. Samples were read on a Becton Dickinson
FACSVantage unit with an Enterprise II laser. Excitation was at 488 nm and emission
was detected with a photomultiplier tube using a 530/30 filter. Week 0 antisera were
run as background control for the week 8 antisera.

Comparison of message from cells grown in vitro and in vivo. Messenger
RNA (mRNA) levels for specific transcripts can be examined by creating a double
stranded cDNA from the mRNA using reverse transcriptase. This cDNA is then
amplified using standard PCR conditions. The resulting amplification products are
thus indicative of the message produced. This technique is useful for comparing the
expression of specific transcripts under varying environmental conditions, such as
growth in culture flasks versus growth in vivo.

Preparation of RNA from cells grown in vitro. In vitro grown Streptococcus
pneumoniae serotypes were grown to log phase in 60 ml THB-0.5% YE at 37°C with
5% CO2. Bacterial cells were harvested by centrifugation at 1000 x g for 15 minutes
at 4°C. The supernatant was aspirated and the cells were resuspended in 1 ml
RNAlater (Ambion, Austin, TX) and stored for >1 hour at 4°C. The cells were then
centrifuged in a microfuge for 5 minutes at 8000 x g. The supernatant was aspirated
and the cells were resuspended in 100 µl 10% deoxycholate (DOC). 1100 µl of
RNAZOL B (Tel-Test, Inc) were then added and the suspension mixed briefly by
inversion. 120 µl of CHCl3 were then added, the sample mixed by inversion and then
centrifuged in a microfuge at full speed for 10 minutes at 4°C. The aqueous layer
was removed and the RNA was precipitated by addition of an equal volume of
2-propanol. The RNA was incubated at 4°C for >1 hour and then centrifuged in a
microfuge at full speed for 10 minutes at room temperature. The supernatant was
aspirated and the RNA was washed with 75% ethanol and recentrifuged for 5 minutes.
The supernatant was aspirated and the RNA was resuspended in 50-100 µl
nuclease-free water. DNA was removed from the RNA by treating the sample with
RNAse-free DNAase (DNA FREE, Ambion) for 20 minutes at 37°C, followed by
inactivation of the enzyme by addition of the DNA FREE chelator. The purity and
yield of the RNA were assessed by measuring the absorbance at 260 nm and 280
nm. Absorbance ratios were typically 1.9-2.0. RNA was stored at -70°C.
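
The absorbance-based purity and yield assessment above can be sketched as follows. The conversion factor of 40 µg/ml of RNA per A260 unit (1 cm path length) is the commonly used convention and is an assumption here, since the specification does not state the factor used. The readings, dilution factor and sample volume in the usage note are hypothetical.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # conventional factor for RNA, 1 cm path (assumed)


def purity_ratio(a260, a280):
    """A260/A280 ratio; the text reports typical values of 1.9-2.0."""
    return a260 / a280


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """RNA concentration of the undiluted sample in ug/ml."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def rna_yield_ug(a260, dilution_factor, sample_volume_ul):
    """Total RNA in the sample, given its resuspension volume in ul."""
    conc = rna_concentration_ug_per_ml(a260, dilution_factor)
    return conc * sample_volume_ul / 1000.0
```

For instance, a hypothetical 1:50 dilution reading A260 = 0.50 and A280 = 0.25 gives a ratio of 2.0 and 1000 µg/ml, or 100 µg of RNA in a 100 µl sample.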

Preparation of RNA from cells grown in vivo. In vivo grown Streptococcus
pneumoniae serotypes were harvested from sealed dialysis tubing incubated in the
peritoneal cavities of Sprague-Dawley rats as described by Orihuela et al. (2000).
Log phase Streptococcus pneumoniae cells were prepared as described above and
resuspended to 10^6 cfu/ml in RPMI media (Celltech) supplemented with 0.4%
glucose. One ml of the cell suspension was sealed in a PVDF dialysis membrane
with an 80,000 Mw cutoff (SpectraPor). Two such bags were implanted
intraperitoneally in 400 g Sprague-Dawley rats (Taconic). The bags remained in the
rats for 22 hours, after which the rats were terminated and the bags were harvested.
RNA was prepared from the intraperitoneally grown cells as described above.

RT-PCR to examine message levels. Specific message for each candidate
gene was amplified from RNA prepared from both in vitro and in vivo grown cells
using RT-PCR. For each reaction, 0.5 µg RNA was incubated with 0.25 µM of the
reverse mining primer for 3 minutes at 75°C, then cooled on ice and transferred to
44°C. The message was reverse transcribed using the RETROscript (Ambion) kit
according to the manufacturer's directions. ReddyMix (ABgene) was used according
to the manufacturer's directions to amplify each message from 2-5 µl of the sample,
using 0.25 µM of the above reverse primer and the forward mining primer. Following
amplification, 10 µl of the amplified product was electrophoresed on a 1% agarose
gel.

Results 

Cloning of ORFs into expression vectors. Fifty-nine ORFs were selected
for cloning and expression based on prediction of surface exposure from genomic
analysis as described above. These ORFs were amplified by PCR and cloned into
the expression vectors as described in Materials and Methods. The ORFs were
cloned into pBAD/Thio-TOPO and pCR-T7/NT-TOPO. Both vectors fuse a
hexahistidine tag and a unique epitope to the cloned protein to facilitate purification
and identification by western blot, respectively. The pBAD vector also fuses a
thioredoxin moiety to the cloned protein to enhance solubility.

Expression of ORFs in E. coli. The genes encoding all 59 ORFs were
induced in the appropriate host E. coli strains and examined for expression by
SDS-PAGE and western blot analysis of whole cell extracts. Of the 59 ORFs, a total of 24
(41%) were expressed at detectable levels. Fourteen of the ORFs that did not
express in either of the expression vectors were cloned into pET27b(+), which fuses a
hexahistidine tag to the C-terminus and a PelB leader sequence to the N-terminus of
the protein. One of the 14 ORFs cloned into pET27b(+) expressed protein.

Purification of Expressed ORF Proteins. All of the expressed ORFs
contained a 6X His motif to aid in purification. Initial purification of all of the proteins
was done using a Ni-containing resin according to the manufacturer's directions. Twenty
of the expressed ORF proteins were purified to acceptable levels of homogeneity for
immunization studies using this affinity purification (Table 17). Specific purification
conditions used are detailed in Materials and Methods and in Table 17. Thirteen of
the 20 ORF proteins were used to immunize mice and obtain antisera specific for the
expressed protein.

Table 17

Purification of Expressed S. pneumoniae ORF Proteins

ORF#  | [Protein] mg/ml | Total Protein mg | Purity %      | Final Buffer                                         | "PSORT" Predicted Location | Location in E. coli
75    | 0.52            | 6.8              | 94%           | PBS/1 mM EDTA pH                                     | Outer membrane             | Cytosol
2615  | 0.42            | 16.8             | 80%           | PBS/1 mM EDTA pH 7.4                                 | Outer membrane             | Cytosol
      | 0.53 (0.14)     |                  |               | 0.1 M Tris/150 mM NaCl/0.05% Zw3-14/1 mM EDTA pH 8.0 | Outer membrane             | Inclusion bodies
1143  | 1.4             | 196              | 92%           | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
1835  | 0.5 (0.2)       | 10.5             | 91.3%         | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
1568  | 1.0             | 5.0              | >85%          | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
2271  | 4.9             | 122.5            | >90%          | PBS, pH 7.4                                          | Inner membrane             | Cytosol
2621  | 1.5             | 4.5              | >90%          | PBS, pH 7.4                                          | Inner membrane             | Cytosol
1104  | 2.0             |                  | 85-90%        | PBS, pH 7.4                                          | Outer membrane             | Cytosol
935   | 0.1             | 0.5              | 85%           | 50 mM Glycine-NaOH/150 mM NaCl/0.05% Z3-14 pH 10.0   | Outer membrane             | Inclusion bodies
3361  | 1.67            | 3.34             | 98%           | PBS/1 mM EDTA pH 7.4                                 | Inner membrane             | Cytosol
339   | 0.91 (0.91)     | 127.4 (27.3)     | 93.2% (80.8%) | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
2322  | 0.55 (0.23)     | 2.5 (0.92)       | 90%           | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
1476  | 1.2 (0.6)       | 9.6              | >80%          | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
3115  | 0.2 (0.5)       | 2.8              | >85%          | PBS/0.05% TX-100/1 mM EDTA pH 7.4                    | Inner membrane             | Inclusion bodies
132   | 4.6             | 460              | 95%           | PBS pH 7.4                                           | -                          | Cytosol
3386  | 3.1             | 27               | 85%           | PBS pH 7.4                                           | Inner membrane             | Cytosol
2112  | 0.6             | 1.8              | 85%           | PBS pH 7.4                                           | Inner membrane             | Cytosol
916   | 0.26            | 1.3              | >85%          | PBS 0.05% TX-100 pH 7.4                              |                            | Inclusion bodies
3373  | 0.97            | 1.9              | 84%           | PBS 0.05% Z3-14 pH 7.4                               | Inner membrane             | Inclusion bodies

Expression of ORF proteins in Streptococcus pneumoniae whole cell
lysates. To determine if the ORFs are being expressed in Streptococcus
pneumoniae, whole cell lysates of in vitro grown cells were probed with the antisera
in Western blot analysis. Each antiserum was reactive with the purified recombinant
protein as a positive control (data not shown). Whole cell lysates from Streptococcus
pneumoniae strains type 3, type 14, and type 19F were examined in Western blot,
and the results are summarized in Table 18. Proteins from three of the ORFs were
undetectable or barely detectable in all of the strains tested. Proteins from eight of
the ORFs were expressed in at least 2 of the strains, while proteins from two ORFs
were detected in only one of the three strains examined. These results demonstrate
that the majority of the proteins from these ORFs were expressed in late log, early
stationary phase Streptococcus pneumoniae, and that some strains may not express
detectable amounts of each ORF at the time point examined.

Table 18

Whole Cell ELISA and Western Blot Expression Data for S. pneumoniae ORFs

Vaccine  | Adjuvant | Whole Cell ELISA  | Western Blot Expression In vitro | FACS Analysis
(10 µg)  | (20 µg)  | Type 3  | Type 14 | Type 3   | Type 14  | Type 19F   | Type 3 | Type 19F
2615     | QS21     | <200    | <200    |          |          |            |        |
3039     | QS21     | <200    | <200    | +        | ++       | ++         |        |
75       | QS21     | 256     | <200    | +++      | +++      | +++        |        |
1568     | QS21     | 4,018   | <200    | ++       | +++      | +++        |        |
1143     | QS21     | 779     | <200    |          |          |            |        |
1835     | QS21     | 202     | <200    |          | +/-      |            |        |
2271     | QS21     | 442     | <200    | +++      | +++      | +++        |        |
2621     | QS21     | 739     | <200    | ++       |          |            |        |
1104     | QS21     | 409     | <200    | +++      | +++      | +++        |        |
339      | QS21     | <200    | <200    |          | +/-      |            |        | ND
2322     | QS21     | <200    | <200    |          |          | +/-        |        | ND
3361     | QS21     | <200    | <200    |          | +        |            | +      | ND
935      | QS21     | <200    | <200    |          |          |            |        | ND
Standard |          | ~45,000 | ~10,000 | ND       | ND       | ND         |        |

Surface exposure of ORF proteins: Whole Cell ELISA. The 13 antisera
against the recombinant ORF proteins were tested for surface reactivity by whole cell
ELISA against two strains of Streptococcus pneumoniae, type 3 and type 14. The
results are shown in Table 18. Seven of the 13 antisera gave detectable whole cell
titers against type 3 Streptococcus pneumoniae, while none of them gave detectable
titers against the type 14 strain. When anticapsular serum was tested against the
homologous capsular serotype, the titer against the type 14 strain was much lower
than that against the type 3 strain (see row labeled "Standard" in Table 18). This
result indicated that there might have been sensitivity issues with the type 14 whole
cell ELISA, because the Western blot data clearly demonstrate that type 14
Streptococcus pneumoniae do express the majority of the proteins of the ORFs
(Table 18). The whole cell ELISA titers of antiserum against the proteins of ORF 75
(SEQ ID NO:218), ORF 1104 (SEQ ID NO:282), ORF 2621 (SEQ ID NO:363), ORF
1568 (SEQ ID NO:306), ORF 1143 (SEQ ID NO:285), ORF 2271 (SEQ ID NO:343),
and ORF 1835 (SEQ ID NO:315) ranged from slightly above background to 20 times
above background. These results indicate that these antisera detect at least some
surface exposed epitopes for these ORFs.

Surface exposure of ORF proteins: FACS Analysis. The polyclonal
antisera against the proteins from ORFs 2615, 3039, 75, 1568, 1143, 1835, 2271,
2621, 1104, 339, 2322, 3361 and 935 were analyzed for surface reactivity with
whole Streptococcus pneumoniae cells by FACS analysis as described above. The
results of the analyses are shown in Table 18. Streptococcus pneumoniae type 3
cells showed a 9-fold increase in geometric mean fluorescence intensity when
labeled with antiserum to ORF 2621 (SEQ ID NO:363). A less intense fluorescence
was detected with antisera directed against the proteins of ORF 1835 (SEQ
ID NO:315), ORF 2271 (SEQ ID NO:343), ORF 75 (SEQ ID NO:218), ORF 1143
(SEQ ID NO:285), and ORF 1104 (SEQ ID NO:282). Nine of the antisera tested did
not show any detectable surface reactivity with the Streptococcus pneumoniae type
19F strain. This may be due to the level of sensitivity of the technique or the capsule
of 19F covering the surface exposed proteins more completely under the conditions
tested.

Analysis of ORF mRNA expression in vitro vs. in vivo. Forward and
reverse mining primers were used to amplify the full length message for several
ORFs, identified by mining algorithms as potential vaccine antigens (Example 1),
from type 3 and type 14 cells grown under in vitro and in vivo conditions. In three of
the four ORFs examined, message was detected in both in vitro and in vivo grown
cells. For ORFs 1104 (SEQ ID NO:282) and 1568 (SEQ ID NO:306), the detection of
message correlated with the presence of an immunoreactive band on a Western blot
of whole cell lysates for the same serotypes. However, for ORF 2322 (SEQ ID
NO:345), message was detected in both serotype 3 and 14, but no immunoreactive
band was present for those serotypes, indicating that either the protein was secreted
or that the antibodies generated by the recombinant protein did not recognize the
native protein. No message was detected for ORF 935 (SEQ ID NO:265) in either
growth condition, which correlates with the absence of an immunoreactive band on a
Western blot. In a separate experiment, message of the expected size was detected
from RNA made from serotype 14 grown in vitro for ORFs 1143 (SEQ ID NO:285),
1475 (SEQ ID NO:298), 3039 (SEQ ID NO:380), 2271 (SEQ ID NO:343), 3115 (SEQ
ID NO:388) and 3361 (SEQ ID NO:402) (data not shown).

Discussion

Prediction of surface exposure is a critical step for genomic mining efforts for
identifying candidate antigens. The algorithms utilized herein have been shown in
the past to have predictive value for selecting candidate ORFs to examine. The
results shown here demonstrate the utility of the algorithms for Streptococcus
pneumoniae and that they represent an advance over the previously utilized
algorithms. Here, 7 out of 13 proteins from ORFs tested are shown to be surface
exposed by at least two of the techniques employed. These techniques, including
whole cell ELISA and FACS analysis of whole Streptococcus pneumoniae cells, have
different strengths for detection of surface exposed epitopes of proteins. Whole cell
ELISA utilizes fixed cells bound to a solid phase support, while FACS analysis uses
living Streptococcus pneumoniae in liquid suspension. However, the whole cell
ELISA is more sensitive than the FACS analysis, and can thus give a more
quantitative determination of surface exposed epitopes at low levels of antibody
binding. It is not known why the protein of ORF 2621 was so strongly positive in the
FACS analysis, yet had a comparatively low whole cell ELISA titer (Table 18). This
may be the result of differing growth conditions or the differing detection conditions
employed in each of the assays. However, the data are consistent in that the
proteins from 6 ORFs that are noted to have surface exposed epitopes all are
positive in both assays employed.

The lack of detection of surface exposure in the 19F strain by FACS is
puzzling. None of the ORFs had detectable epitopes on the surface of the 19F strain
in the FACS technique used, but the majority of them were well expressed in whole
cell lysates from this strain (Table 18). This may be due to the unique capsular
material of 19F covering the surface exposed proteins, or to the FACS technique being
less sensitive against type 19F cells. It is also possible that none of the proteins
tested have surface exposed epitopes in type 19F, but this is extremely unlikely,
since even antiserum against another known candidate that is surface exposed
(PhpA protein) (Zhang et al., 2001) produced much less detectable surface antibody
binding to type 19F cells in FACS analysis than to type 3 cells (data not shown).

The failure to detect surface reactive antibody in the type 14 whole cell ELISA
(Table 18) was also most likely due to the growth of the cells or the assay conditions,
because the standard sera employed gave a much lower titer than normally
observed.

The RT-PCR data serve to reinforce the potential of the candidate proteins
from these ORFs. The data show that Streptococcus pneumoniae grown either in
vitro or in vivo produce mRNA specific for the ORFs examined. Since it is known that
the ORFs are expressed in vitro, it is likely that they are also expressed in vivo.
Experiments are in progress to confirm this using whole cell lysates from in vivo
grown cells.

Not every ORF analyzed could be shown to be expressed in Streptococcus
pneumoniae. For example, a protein from ORF 935 was not detected by Western
blot analysis, whole cell ELISA (Table 18), or RT-PCR (data not shown). It may be
that ORF 935 is only expressed under "real" in vivo conditions or that the sequencing
of the region is incorrect and the expressed protein is out of frame with the true
protein produced by Streptococcus pneumoniae.

Example 3

Streptococcus pneumoniae Proteome Analysis

Materials and Methods

Bacteria and media. S. pneumoniae type 3 (ATCC #6303) was obtained
from the American Type Culture Collection, Manassas, VA. S. pneumoniae type 19F
was obtained from Dr. Gerald Schiffman, State University of New York, Brooklyn, NY.
Each glycerol stock was plated on a Tryptic Soy Agar II (TSA II)/5.0% sheep blood plate
(Becton Dickinson Microbiology Systems, Cockeysville, MD) and incubated
overnight at 37°C in the presence of 5.0% CO2. Cells from each plate were
transferred to 20 ml of Todd-Hewitt Broth/0.5% Yeast Extract (THY) and incubated
overnight at 37°C with gentle shaking (10 rpm) in the presence of 5.0% CO2. For
type 3, the culture was then diluted 10 fold with 100 ml of THY. For type 19F, the
culture was then diluted 40 fold with 200 ml of THY. Both of these diluted cultures
were subsequently incubated under the above conditions. Type 19F required 9 h
incubation time to reach a concentration of 1 x 10^9 cells/ml. Type 3 was incubated
overnight and its concentration was not determined.

Isolation of membrane fraction. The bacterial cultures were spun down and
washed with PBS/MgSO4 (30 mM sodium phosphate/150 mM NaCl/1 mM MgSO4, pH
6.8). The pellets were resuspended in 4 ml of PBS/MgSO4 containing 5 µg
Lysozyme (Sigma Chemical Co., St. Louis, MO), and 400 µg Mutanolysin (Sigma).
The samples were incubated at 37°C for 1 hour with shaking. After the incubation,
~300 units of RNAse Cocktail™ (Ambion Inc., Austin, TX) was added to each
sample. The samples were centrifuged at low speed using a tabletop centrifuge (2.5
k rpm, 10 min, at 4°C). The supernatant was subsequently spun at high speed to
pellet the membrane fractions using a Beckman (Beckman Instruments, Inc., Palo
Alto, CA) Model L8-70M Preparative Ultracentrifuge (60Ti rotor, at 40k rpm, 4°C, 1
h). The supernatant was removed and the membrane pellet was washed with
PBS/MgSO4.

Trypsin digestion of excised SDS-PAGE gel bands. Mini SDS-PAGE gels (10
cm x 10 cm) were run with precast 10-20% (w/v, acrylamide) gradient gels (Zaxis,
Hudson, OH) at 200 V. The See Blue molecular weight standard used was obtained
from Invitrogen, Carlsbad, CA. The gels were stained with Simply Blue Safestain, a
colloidal Coomassie Blue G250 stain (Invitrogen) as per manufacturer's instructions.
Each sample lane, in its entirety, was cut into 15 different bands. For each sample,
bands representing identical molecular weight areas of the gel from three sample
lanes, run next to each other, were collected together for further processing. The gel
slices were washed twice with 0.5 ml of 50% (v/v) aqueous HPLC grade acetonitrile
(Burdick & Jackson, Muskegon, MI) for 5 min with gentle shaking and stored frozen
at -20°C following removal of the wash liquid. Frozen gel bands were thawed,
cut into 1 mm cubes and subjected to in-gel trypsin digestion using a DigestPro robot
(ABIMED Analysen-Technik GmbH, Langenfeld, Germany). In the configuration
used, up to 30 samples could be processed simultaneously. The automated protocol
consisted of the following steps in order: reduction of the protein in the gel bands with
dithiothreitol, alkylation with iodoacetamide, digestion with trypsin and elution of the
peptides. Sequencing Grade Modified Trypsin obtained from Promega Corporation,
Madison, WI was used. This trypsin is highly specific for hydrolysis of peptide bonds
at the carboxylic sides of lysine and arginine residues. It is modified by reductive
methylation to make it extremely resistant to autolysis, which can generate
pseudotrypsin with chymotrypsin-like specificity. Specificity is further improved by
treatment with L-1-chloro-3-tosylamido-4-phenylbutan-2-one (TPCK) followed by
affinity purification. The peptide digests were collected, dried using a SpeedVac
(Thermo Savant, Holbrook, NY) to ~10 µl, and subsequently diluted to 50 µl with 0.1
M acetic acid. Samples were transferred to plastic autosampler vials, sealed, and
injected using a 5 µl sample loop.

Microcapillary LC-Mass Spectrometry. Mass spectral data were acquired on
a Thermo Finnigan LCQ DECA quadrupole ion trap mass spectrometer (Thermo
Finnigan, San Jose, CA) equipped with a microcapillary reversed-phase
HPLC/micro-electrospray interface. Peptide extracts were analyzed on an
automated microelectrospray reversed phase HPLC. The microelectrospray
interface consisted of a Picofrit fused silica spray needle, 10 cm length by 75 µm ID,
15 µm orifice diameter (New Objective, Cambridge, Massachusetts) packed with 10
µm C18 reversed-phase beads (YMC, Wilmington, North Carolina) to a length of 10
cm. The Picofrit needle was mounted in a fiber optic holder (Melles Griot, Irvine,
California) held on a base positioned at the front of the mass spectrometer detector.
The rear of the column was plumbed through a titanium union to supply an electrical
connection for the electrospray interface. The union was connected with a length of
fused silica capillary (FSC) tubing to a FAMOS autosampler (LC-Packings, San
Francisco, California) that was connected to an HPLC solvent pump (ABI 140C,
Perkin-Elmer, Norwalk, Connecticut). The HPLC solvent pump delivered a flow of 50
µl/min, which was reduced to 250 nl/min using a PEEK microtight splitting tee
(Upchurch Scientific, Oak Harbor, Washington), and then delivered to the
autosampler using an FSC transfer line. The HPLC pump and autosampler were
each controlled using their internal user programs.
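
The flow reduction at the splitting tee can be expressed as a split ratio. The sketch below only restates the two flow rates given above; the function names are illustrative and not part of any instrument software.

```python
def split_ratio(pump_flow_ul_min, column_flow_nl_min):
    """Ratio of pump flow to column flow, with both converted to nl/min."""
    return (pump_flow_ul_min * 1000.0) / column_flow_nl_min


def waste_flow_ul_min(pump_flow_ul_min, column_flow_nl_min):
    """Flow diverted away from the column at the tee, in ul/min."""
    return pump_flow_ul_min - column_flow_nl_min / 1000.0


# 50 ul/min from the pump reduced to 250 nl/min at the column.
ratio = split_ratio(50.0, 250.0)
diverted = waste_flow_ul_min(50.0, 250.0)
```

With the stated flows this gives a 200:1 split, so 0.5% of the pump output reaches the column and 49.75 µl/min is diverted.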

Five microliters of the tryptic digest was separated using the C18
microcapillary HPLC column eluting directly into the orifice of the mass spectrometer.
Peptides were separated at a flow rate of 250 nl/min using a 50 minute gradient of
4-65% (v/v) acetonitrile in 0.1 M acetic acid. Peptide analyses were conducted on the
LCQ-DECA ion trap mass spectrometer operating at a spray voltage of 1.5 kV, and
using a heated capillary temperature of 140°C. Data were acquired in automated
MS/MS mode using the data acquisition software provided with the instrument. As
the peptides elute from the HPLC into the mass spectrometer, they are detected and
fragmented in a data dependent manner using "dynamic exclusion". In this
technique, the ion trap cycles between full scan and collision induced dissociation
(CID) mode, first detecting candidate ions, and then collecting them for
fragmentation. Decisions about which ions are going to be fragmented are
made by the instrument "on the fly". The ions, once collected, are then added to
an exclusion list and are rejected for a window of two minutes. This technique allows
the instrument to distribute its time efficiently when presented with analytes of very
high complexity. The operation can result in the collection of as many as 1000 to
2000 fragmentation (CID) spectra in a single run. The acquisition method included 1
MS scan (375-600 m/z) followed by MS/MS scans of the top 2 most abundant ions in
the MS scan. The instrument then conducted a second MS scan (600-1000 m/z)
followed by MS/MS scans of the top 2 most abundant ions in that scan. The dynamic
exclusion and isotope exclusion functions were employed to increase the number of
peptide ions that were analyzed (settings: 3 amu = exclusion width, 3 min = exclusion
duration, 30 sec = pre-exclusion duration, 3 amu = isotope exclusion width). For the
current experiment involving 30 samples, the data were collected in a completely
automated fashion over 48 hours using the autosampler.

Sequence database search for identification of proteins from CID spectra.
Automated analysis of MS/MS data was performed using the SEQUEST computer
algorithm (Eng, McCormack and Yates, 1994) incorporated into the Finnigan
Bioworks data analysis package (ThermoFinnigan, San Jose, California) using the
protein sequence databases described below. Because SEQUEST is highly computation
intensive, the searches for this study were performed on a dedicated 12 x 600 MHz
PC cluster. Peptide matches with Xcorr values greater than 2.0 were loaded into a
database for further computational analysis followed by manual verification of the
data where necessary (as described below).

10 

Results and Discussion 

Proteomics Based Approach 

The term 'proteome' has been defined as the proteins expressed by the
genome of an organism or tissue. One of the primary goals of analysis of the
proteome, or proteomics, is the identification of proteins in a large-scale,
high-throughput format. Bacterial membrane preparations constitute a very important
source for surface localized proteins, which are likely candidate antigens. A
proteomics based approach was taken to identify the protein components of the
complex mixture of proteins contained in the membrane fraction of Streptococcus
pneumoniae. The study of membrane associated proteins offers a very specific and
significant challenge for proteomics. The detergents required to keep these proteins
in aqueous solution usually interfere with analytical methods. During
two-dimensional (2-D) gel electrophoresis, which has been widely used for the analysis of
soluble proteins, severe quantitative loss of membrane proteins is often observed.
The problem is more severe when immobilized pH gradients are used in the first
dimension. To minimize such solubility problems with membrane preparations from
some other bacteria, several sample preparations, as well as some novel zwitterionic
detergents, were tested, all of which were shown to improve the analysis of
membrane proteins by 2-D gel electrophoresis. However, applicants believe their
success in identifying the major set of outer membrane proteins was quite limited. In
view of this, a novel combination of a very simple and a very complex method for
identification of the membrane proteome component of Streptococcus pneumoniae
has been applied, as described below.

In this approach, the membrane preparation was first separated by sodium
dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) using a mini gel
format, followed by staining of the gel with a colloidal Coomassie blue stain. Fifteen
gel bands together covering the entire sample lane were excised and the bands digested
individually with trypsin. The tryptic peptides were analyzed using microcapillary
reversed-phase liquid chromatography-micro-electrospray tandem mass
spectrometry (LC-MS/MS) on a Finnigan LCQ Deca quadrupole ion trap mass
spectrometer. Tandem mass spectrometry (MS/MS) has been shown to be a
powerful approach to analyze proteins (Eng, McCormack and Yates, 1994). In the
first step, MS/MS uses a mass analyzer to separate a peptide ion from a mixture of
ions, then uses a second step or mass analyzer to activate and dissociate the ion of
interest. This process, known as collision-induced dissociation (CID), causes the
peptide to fragment at the peptide bonds between the amino acids, and the
fragmentation pattern of a peptide is used to determine its amino acid sequence.
The SEQUEST computer algorithm (Eng, McCormack and Yates, 1994) was used to
search the uninterpreted experimental fragmentation spectra against protein or
translated nucleotide sequence databases to identify the proteins present in each gel
band. SEQUEST conceptually digests protein sequences in a database into tryptic
peptides and then models them into simulated CID spectra using the known rules of
peptide fragmentation. SEQUEST then compares these simulated CID spectra
against the experimental spectra and returns a list of probable peptide sequences
matching the raw data along with different parameters representing the fidelity of the
match. For peptides above roughly 800-900 Dalton in size, a single spectrum can
uniquely identify a protein.
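
The conceptual digestion and fragmentation steps described above can be sketched
as follows. This Python code is a minimal illustration and is not the SEQUEST
implementation. The function names are hypothetical, the cleavage rule (after K or
R, but not before P) is the commonly used trypsin rule assumed here, and only
singly charged b and y ions with monoisotopic masses are modeled.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O retained by C-terminal (y) fragments
PROTON = 1.00728   # charge-carrying proton for a 1+ ion


def tryptic_digest(protein, min_len=1):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_len]


def fragment_mz(peptide):
    """Return lists of singly charged b-ion and y-ion m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    # b ions: N-terminal fragments; y ions: C-terminal fragments.
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b_ions, y_ions
```

A database search of the kind described would generate such theoretical fragment
lists for every candidate peptide and compare them with each experimental spectrum.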

To obtain sequence information on multiple peptides from the complex
mixture generated by trypsin digestion of the SDS-PAGE gel bands, a reversed
phase chromatography system was coupled to an electrospray ion trap mass
spectrometer. In this system, it is known that high sensitivity (down to sub-femtomole
levels) can be attained by minimizing both flow rate and column diameter to
concentrate the elution volume and direct as much of the column effluent as possible
into the orifice of the mass spectrometer. To maximize the coverage of proteins
present in the sample, the data-dependent acquisition feature of the ion trap was
employed. Dynamic exclusion was used to prevent reacquisition of tandem mass
spectra of ions once a spectrum had been acquired for a particular m/z value. Use of
these data-dependent features dramatically increased the number of peptide ions
that were selected for CID analysis.

The LC-MS/MS data acquisition conditions described above typically resulted
in fragmentation data for more than 2000 peptide ions for each run. Using the
SEQUEST algorithm, these data were correlated against two protein sequence
databases. The first one, SnA6F6, contained open reading frames obtained from
translation of the Streptococcus pneumoniae type 4 genome sequence (TIGR4) in all six
reading frames with the smallest peptide containing six amino acid residues. The
second one, nr, is a non-redundant GenBank protein sequence database.
SEQUEST search conditions used trypsin selectivity for both of the searches. The
SnA6F6 search allowed a differential search of +16 Dalton for methionine residues to
account for peptides displaying oxidation of methionine.
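
A six-frame translation of the kind used to build a database such as SnA6F6 can be
sketched as below. This is an illustrative Python example, not the procedure
actually used to build that database: the function names are hypothetical, the
standard codon table is assumed, and an "open reading frame" is taken here to mean
any stop-to-stop stretch of at least six residues, which is an assumption based on
the six-residue minimum stated above.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_orfs(dna, min_len=6):
    """Translate both strands in all three frames and split at stop codons."""
    dna = dna.upper()
    orfs = []
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            protein = "".join(
                CODON_TABLE.get(strand[i:i + 3], "X")  # X for ambiguous codons
                for i in range(frame, len(strand) - 2, 3)
            )
            # Keep only stop-to-stop segments of at least min_len residues.
            orfs.extend(seg for seg in protein.split("*") if len(seg) >= min_len)
    return orfs
```

The differential +16 Dalton search for oxidized methionine would then be applied
at search time to peptides derived from these translated sequences.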

Candidate matches identified by SEQUEST were confirmed using the
following procedure. For each peptide, SEQUEST computes an Xcorr value from
cross correlation of the experimental MS/MS spectrum with the candidate peptides in
the sequence database. The Xcorr is a measure of the similarity of the experimental
MS/MS data to that generated from the sequence database. Peptide matches with
Xcorr values greater than 2.0 were selected for further analysis and loaded on to an
in-house developed system for analysis of SEQUEST data using the commercially
available Oracle® relational database system. Since the SEQUEST output is quite
complex, applicants incorporated a new scoring algorithm in Oracle® to calculate a
match score for each protein identified as follows:

Protein Score = n Σ(Xcorr/rank)

where the rank is that assigned by SEQUEST for each peptide sequence identified
from a specific protein sequence in the database and n is the number of unique
peptides identified for that protein, since the same peptide may be identified multiple
times in an LC-MS/MS experiment. The fragmentation spectra for all moderate or
weak assignments by the software used were checked manually by direct
examination of the CID spectra for reasonable signal/noise ratio, and the list of
matched ions was also examined for reasonable continuity. Generally three or more
spectra converging with reasonable Protein Score (usually >25) or Xcorr values
(usually >2.5) onto a single database entry constitute a convincing identification.
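
The Protein Score calculation can be illustrated with a short Python sketch. This
is not the applicants' Oracle® implementation. The function name and the input
layout are hypothetical, and the sketch assumes that the sum runs over every
matched spectrum passing the Xcorr cutoff of 2.0 while n counts distinct peptide
sequences.

```python
from collections import defaultdict


def protein_scores(matches, xcorr_cutoff=2.0):
    """Compute n * sum(Xcorr / rank) per protein.

    matches: iterable of (protein_id, peptide, xcorr, rank) tuples,
    one tuple per matched spectrum.
    """
    total = defaultdict(float)   # running sum of Xcorr / rank per protein
    unique = defaultdict(set)    # distinct peptide sequences per protein
    for protein, peptide, xcorr, rank in matches:
        if xcorr > xcorr_cutoff:
            total[protein] += xcorr / rank
            unique[protein].add(peptide)
    return {p: len(unique[p]) * total[p] for p in total}
```

For example, a protein matched by two spectra of one peptide (Xcorr 3.0 and 2.5,
both rank 1) and one spectrum of a second peptide (Xcorr 4.0, rank 2) would have a
sum of 7.5 and n = 2 under these assumptions.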

The rationale behind the experimental proteomics approach for
characterization of membrane associated proteins of Streptococcus pneumoniae was
that the single SDS-PAGE step circumvented the solubility complications associated
with isoelectric focusing in 2-D gel electrophoresis. It also offered a simple
fractionation of the membrane preparation according to molecular weight that
reduced the complexity of the samples subjected to LC-MS/MS analysis. The
combination of these analytical techniques allowed us to separate and obtain
sequence information of multiple peptides with high sensitivity over a large
concentration range and identify the corresponding proteins by correlation with
sequences in databases. As part of this study, a method for the isolation of
membrane preparations from Streptococcus pneumoniae was also developed. This
involved enzymatic digestion of Streptococcus pneumoniae cell walls with
mutanolysin and lysozyme in a hypotonic buffer followed by differential centrifugation.
The twenty-eight ORFs representing surface exposed proteins were also identified by
the proteomic approach and are presented in Table 11. The ORFs representing
membrane associated proteins and identified by the proteomic approach are
presented in Table 12. Table 14 contains all the open reading frames identified from
the SnA6F6 database representing the TIGR4 genomic sequence. Table 14 also
contains proteins identified from the nr database search which do not originate from
the TIGR4 genome.

Combination of Genomics and Proteomics Approaches 

The ORFs identified by proteomics represent surface localized, surface
exposed or membrane associated proteins of Streptococcus pneumoniae. Those
twenty-eight ORFs that support the putative surface exposed ORFs identified by
genomics approaches (i.e., Tables 1-10) are listed in Table 11 and provide further
evidence of surface localization of these candidates. The 161 novel ORFs identified
by proteomics as membrane associated are listed in Table 12.

Example 4

Immunogold Labeling of Streptococcus pneumoniae and Low Voltage
Scanning Electron Microscopy

Surface exposure of proteins on Streptococcus pneumoniae may also be
assessed by immunogold labeling of whole bacteria and electron microscopy.
Bacterial cells are labeled as previously described (Olmsted et al., 1993). Briefly,
late-log phase bacterial cultures are washed twice, and resuspended to a
concentration of 1 x 10^8 cells/ml in 10 mM phosphate buffered saline (PBS) (pH 7.4)
and placed on poly-L-lysine coated glass coverslips. Excess bacteria are gently
washed from the coverslips and unlabeled samples are placed into fixative (2.0%
glutaraldehyde, in a 0.1 M sodium cacodylate buffer containing 7.5% sucrose) for 30
min. Bacteria to be labeled with colloidal gold are washed with PBS containing 0.5%
bovine serum albumin, and the pre-immune or hyper-immune mouse polyclonal
antibody prepared above is applied for 1 hour at room temperature. Bacteria are then
gently washed, and a 1:6 dilution of goat anti-mouse conjugated to 18 nm colloidal
gold particles (Jackson ImmunoResearch Laboratories, Inc., West Grove, PA) is
applied for 10 min at room temperature. Finally, all samples are washed gently with
PBS, and placed into the fixative described above. The fixative is washed from
samples twice for 10 min in 0.1 M sodium cacodylate buffer, and the samples are
postfixed for 30 min in 0.1 M sodium cacodylate containing 1% osmium tetroxide. The
samples are then washed twice with 0.1 M sodium cacodylate, dehydrated with successive
concentrations of ethanol, critical point dried by the CO2 method of Anderson
(Anderson, 1951) using a Samdri-780A (Tousimis, Rockville, MD), and coated with a
1-2 nm discontinuous layer of platinum. Streptococcus pneumoniae cells are viewed
with a LEO 1550 field emission scanning electron microscope operated at low
accelerating voltages (1-4.5 keV) using a secondary electron detector for
conventional topographical imaging and a high-resolution Robinson backscatter
detector to enhance the visualization of colloidal gold by atomic number contrast.

Example 5

In vitro opsonophagocytosis analysis

An in vitro opsonic reaction, which may mimic the in vivo reaction, is conducted
by incubating together a mixture of Streptococcus pneumoniae cells, heat inactivated
human serum containing specific antibodies to the pneumococcal strain, and an
exogenous complement source. Opsonophagocytosis proceeds during incubation of
freshly isolated human polymorphonuclear cells (PMN's) and the
antibody/complement/pneumococcal cell mixture. Bacterial cells that are coated with
antibody and complement are killed upon opsonophagocytosis. Colony forming units
(cfu) of surviving bacteria that escape from opsonophagocytosis are determined by
plating the assay mixture. Titers are reported as the reciprocal of the highest dilution
that gives > 50% bacterial killing, as determined by comparison to assay controls.
Specimens which demonstrate less than 50% killing at the lowest serum dilution
tested (1:8) are reported as having an OPA titer of 4. The highest dilution tested is
1:2560. Samples with ≥ 50% killing at the highest dilution are repeated, beginning
with a higher initial dilution.
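
The titer rule described above can be expressed as a short calculation. The
Python sketch below is illustrative only. The function name and input layout are
hypothetical, and treating exactly 50% killing as meeting the threshold is an
assumption of this example.

```python
def opa_titer(control_cfu, cfu_by_dilution, threshold=0.5, floor_titer=4):
    """Return the OPA titer for one serum.

    control_cfu: cfu recovered from the control well.
    cfu_by_dilution: dict mapping the reciprocal serum dilution
    (8, 16, 32, ...) to the surviving cfu at that dilution.
    """
    dilutions = sorted(cfu_by_dilution)

    def killing(dilution):
        # Fraction of bacteria killed relative to the control.
        return 1.0 - cfu_by_dilution[dilution] / control_cfu

    # Less than 50% killing at the lowest dilution tested: report 4.
    if killing(dilutions[0]) < threshold:
        return floor_titer
    # Otherwise report the highest dilution that still reaches the threshold.
    return max(d for d in dilutions if killing(d) >= threshold)
```

If the returned titer equals the highest dilution tested, the serum would be
repeated beginning with a higher initial dilution, as stated above.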

The present method is a modification of Gray's method (Gray, B.M. 1990).
The assay mixture is assembled in a 96-well microtiter tissue culture plate at room
temperature. The assay mixture consists of 10 µL of test serum (a series of two-fold
dilutions) heated to 56°C for 30 minutes prior to testing, 10 µL of precolostral bovine
serum (complement source) having no opsonic activity for the bacterial test strain,
and 20 µL of buffer containing 2000 viable Streptococcus pneumoniae organisms.
This mixture is incubated at 37°C without CO2 for 30 minutes with shaking. Next, 40
µL of human PMNs, freshly prepared from heparinized peripheral blood by dextran
sedimentation and Percoll density centrifugation, suspended in buffer at a
concentration of 1 x 10^6/mL is added. The assay plate(s) are then incubated at 37°C
for an additional 90 minutes with vigorous shaking. Aliquots from each well are
dispensed onto the upper 1/4 of a 15 x 100 mm blood agar plate. The blood agar
plate is tilted while pipetting to allow the liquid suspension to "run" down the plate.
Plates are incubated overnight in 5% CO2 at 37°C. The viable cfu are counted the
following morning. Negative control wells, lacking bacterial cells, test serum,
complement and/or phagocytes in appropriate combination, are included in each
assay. A test serum control, which contains test serum plus bacterial cells and heat
inactivated complement, is included for each individual serum. This control can be
used to assess whether antibiotics or other serum components are
capable of killing the bacterial strain directly (i.e., in the absence of complement or
PMN's). A human serum with known opsonic titer is used as a positive human serum
control. The opsonic antibody titer for each unknown serum is calculated as the
reciprocal of the initial dilution of serum giving 50% cfu reduction compared to the
control without serum.

Example 6

Intranasal or parenteral immunization of CBA/CaHN mice prior to challenge

Six-week old, pathogen-free, male CBA/CaHN xid/J (CBA/N) mice are
purchased from Jackson Laboratories (Bar Harbor, Maine) and housed in cages
under standard temperature, humidity, and lighting conditions. CBA/N mice, at 10
animals per group, are immunized with an appropriate amount of the protein(s) to be
tested. For parenteral immunization, the protein is mixed with 100 µg of MPL™ per
dose to a final volume of 200 µl in saline and then injected subcutaneously (SC) into
mice. All groups receive a booster with the same dose and by the same route 3 and
5 weeks after the primary immunization. Control mice are injected with MPL™ alone.
All mice are bled two weeks after the last boosting; sera are then isolated and stored at
-20°C. For intranasal (IN) immunization, mice receive three IN immunizations, one
week apart. On each occasion, an appropriate dose of the protein to be tested is
formulated with 0.1 µg of CT-E29H, a genetically modified cholera toxin that is
reduced in enzymatic activity and toxicity (Tebbey et al., 2000), and slowly instilled
into the nostril of each mouse in a 10 µl volume. Mice immunized with CT-E29H
alone are used as controls. Serum samples are collected one week after the last
immunization.

Example 7 
LD50 determination

Six or 12-week old CBA/N mice (10 per group) are challenged intranasally
(IN) with 10 µl of a suspension of streptomycin resistant type 3 Streptococcus
pneumoniae diluted to 5 x 10^9 CFU/ml in PBS. Two-fold serial dilutions of this
suspension are also tested. The actual doses of bacteria administered are
determined by plating dilutions of the inoculum on streptomycin containing tryptic soy
agar plates. The LD50 is calculated by the Reed-Muench method as discussed by
Lennette (Lennette, 1995). The LD50 of 13-week old CBA/N mice with type 3 strain
was previously shown to be 1 x 10^5 CFU, while the LD50 of 6-week old CBA/N mice
was 1 x 10^4 CFU.
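
The Reed-Muench calculation referred to above can be sketched in Python as
follows. This is a generic illustration of the method with hypothetical names and
made-up inputs, not data from this study. It accumulates deaths from the lowest
dose upward and survivors from the highest dose downward, then interpolates on a
log10 dose scale between the two doses that bracket 50% cumulative mortality.

```python
import math


def reed_muench_ld50(doses, deaths, group_sizes):
    """Estimate the LD50 from per-dose death counts by Reed-Muench."""
    order = sorted(range(len(doses)), key=lambda i: doses[i])
    d = [doses[i] for i in order]
    dead = [deaths[i] for i in order]
    alive = [group_sizes[i] - deaths[i] for i in order]

    # Animals dying at a dose are assumed to die at all higher doses;
    # animals surviving a dose are assumed to survive all lower doses.
    cum_dead = [sum(dead[:k + 1]) for k in range(len(d))]
    cum_alive = [sum(alive[k:]) for k in range(len(d))]
    pct = [100.0 * cd / (cd + ca) for cd, ca in zip(cum_dead, cum_alive)]

    for k in range(len(d) - 1):
        if pct[k] <= 50.0 <= pct[k + 1] and pct[k + 1] > pct[k]:
            # Proportionate distance between the bracketing doses.
            pd = (50.0 - pct[k]) / (pct[k + 1] - pct[k])
            log_ld50 = math.log10(d[k]) + pd * (
                math.log10(d[k + 1]) - math.log10(d[k]))
            return 10 ** log_ld50
    raise ValueError("50% mortality is not bracketed by the tested doses")
```

In practice the doses passed to such a function would be the actual doses
determined by plating the inoculum, rather than the nominal dilutions.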

Example 8

CBA/CaHN xid Mouse intranasal challenge model

Mice are challenged with either serotype 3 or serotype 14 streptomycin
resistant Streptococcus pneumoniae. Pneumococci are inoculated into 3 ml of
Todd-Hewitt broth containing 100 µg/ml of streptomycin. The culture is grown at 37°C until
mid-log phase, then diluted to the desired concentration with Todd-Hewitt broth and
stored on ice until use. Each mouse is anesthetized with 1.2 mg of ketamine HCl
(Fort Dodge Laboratory, Ft. Dodge, Iowa) by intraperitoneal (IP) injection. The
bacterial suspension is inoculated into the nostril of anesthetized mice (10 µl per
mouse). The actual dose of bacteria administered is confirmed by plate count. Two
or 3 days after challenge, mice are sacrificed, and the noses are removed and
homogenized in 3 ml of sterile saline with a tissue homogenizer (Ultra-Turrax T25,
Janke & Kunkel Ika-Labortechnik, Staufen, Germany). The homogenate is 10-fold
serially diluted in saline and plated on streptomycin containing TSA plates. Fifty µl of
blood collected 2 days post-challenge from each mouse are also plated on the same
kind of plates. Plates are incubated overnight at 37°C and then colonies are counted.
CBA/N mice are observed daily after challenge, and the mortality is monitored for 14
days.
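
The colony counts obtained from the serially diluted nasal homogenate and from
blood can be converted back to bacterial loads with simple arithmetic. The Python
sketch below is illustrative only. The function names are hypothetical, and the
volume plated from each homogenate dilution is not stated above, so it is left as
a parameter that the user must supply.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Back-calculate CFU/ml of the undiluted sample from one plate count.

    dilution_factor: fold dilution of the plated sample (1 for undiluted,
    10 for the first 10-fold dilution, 100 for the second, and so on).
    """
    return colonies * dilution_factor / plated_volume_ml


def cfu_per_nose(colonies, dilution_factor, plated_volume_ml,
                 homogenate_volume_ml=3.0):
    """Total CFU recovered from one nose homogenized in 3 ml of saline."""
    return (cfu_per_ml(colonies, dilution_factor, plated_volume_ml)
            * homogenate_volume_ml)
```

For blood, where fifty µl are plated undiluted, `cfu_per_ml(colonies, 1, 0.05)`
gives the bacterial concentration in the blood sample.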


Example 9 

Intranasal immunization of BALB/c mice prior to challenge

Six-week old, pathogen-free, BALB/c mice are purchased from Jackson Laboratories
(Bar Harbor, Maine) and housed in cages under standard temperature, humidity, and
lighting conditions. BALB/c mice, at 10 animals per group, are immunized with an
appropriate amount of the protein to be tested on weeks 0, 2, and 4. On each
occasion, the protein being tested is formulated with 0.1 µg of CT-E29H, and slowly
instilled into the nostril of each mouse in a 10 µl volume. Mice immunized with
Keyhole Limpet Hemocyanin (KLH)-CT-E29H are used as controls. Serum samples
are collected 4 days after the last immunization.

Example 10
Mouse intranasal challenge model

BALB/c mice are challenged on the sixth day of week 4 (i.e., at approximately
27 days) with 1 x 10^5 CFU of serotype 3 streptomycin resistant Streptococcus
pneumoniae. Pneumococci are inoculated into 3 ml of Todd-Hewitt broth containing
100 µg/ml of streptomycin. The culture is grown at 37°C until mid-log phase, then
diluted to the desired concentration with Todd-Hewitt broth and stored on ice until
use. Each mouse is anesthetized with 1.2 mg of ketamine HCl (Fort Dodge
Laboratory, Ft. Dodge, Iowa) by i.p. injection. The bacterial suspension is inoculated
into the nostril of anesthetized mice (10 µl per mouse). The actual dose of bacteria
administered is confirmed by plate count. Four days after challenge, mice are
sacrificed, and the noses are removed and homogenized in 3 ml of sterile saline with a
tissue homogenizer (Ultra-Turrax T25, Janke & Kunkel Ika-Labortechnik, Staufen,
Germany). The homogenate is 10-fold serially diluted in saline and plated on
streptomycin containing TSA plates. Fifty µl of blood collected 2 days post-challenge
from each mouse also is plated on the same kind of plates. Plates are incubated
overnight at 37°C and then colonies are counted.

What is claimed is:

1. An isolated polynucleotide of a Streptococcus pneumoniae genomic 
sequence, wherein the polynucleotide comprises a nucleotide sequence 
having at least about 95% identity to a nucleotide sequence chosen from one 
of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO:431 through SEQ 
ID NO:591, a degenerate variant thereof, or a fragment thereof.

2. The polynucleotide of claim 1, wherein the polynucleotide is a complement to 
a nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID 
NO: 215 or SEQ ID NO:431 through SEQ ID NO:591, a degenerate variant 
thereof, or a fragment thereof. 

3. The polynucleotide of claim 2, wherein the polynucleotide is selected from the 
group consisting of DNA, chromosomal DNA, cDNA and RNA. 

4. The polynucleotide of claim 3, wherein the polynucleotide further comprises 
heterologous nucleotides. 

5. An isolated polynucleotide which hybridizes to a nucleotide sequence chosen 
from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO:431 
through SEQ ID NO:591, a complement thereof, a degenerate variant thereof, 
or a fragment thereof, under high stringency hybridization conditions. 

6. The polynucleotide of claim 5, wherein the polynucleotide hybridizes under 
intermediate stringency hybridization conditions. 

7. An isolated polynucleotide of a Streptococcus pneumoniae genomic 
sequence, wherein the polynucleotide comprises a nucleotide sequence 
chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID 
NO:431 through SEQ ID NO:591, a fragment thereof, or a degenerate variant 
thereof, and encodes a polypeptide, a biological equivalent thereof, or a
fragment thereof, selected from the group consisting of: 

(a) a Streptococcus pneumoniae polypeptide having 0, 1 or 2 
transmembrane domains; 

(b) a Streptococcus pneumoniae polypeptide having 3 or more 
transmembrane domains; 

(c) a Streptococcus pneumoniae polypeptide having an outer membrane 
domain or a periplasmic domain; 

(d) a Streptococcus pneumoniae polypeptide having an inner membrane 
domain; 

(e) a Streptococcus pneumoniae polypeptide identified by Blastp analysis; 

(f) a Streptococcus pneumoniae polypeptide identified by Pfam analysis; 

(g) a Streptococcus pneumoniae lipoprotein; 

(h) a Streptococcus pneumoniae polypeptide having a LPXTG motif, 
wherein the polypeptide is covalently attached to the peptidoglycan 
layer; 

(i) a Streptococcus pneumoniae polypeptide having a peptidoglycan 
binding motif, wherein the polypeptide is associated with the 
peptidoglycan layer; 

(j) a Streptococcus pneumoniae polypeptide having a signal sequence
and a C-terminal Tyrosine or a C-terminal Phenylalanine amino acid; 

(k) a Streptococcus pneumoniae polypeptide having a tripeptide RGD 
amino acid sequence; 

(l) a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed; 

and 

(m) a Streptococcus pneumoniae polypeptide identified by proteomics as 
membrane associated. 

8. The polynucleotide of claim 7, wherein the polynucleotide is a complement to 
a nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID 
NO: 215 or SEQ ID NO:431 through SEQ ID NO:591, a degenerate variant 
thereof, or a fragment thereof. 

9. The polynucleotide of claim 8, wherein the polynucleotide is selected from the
group consisting of DNA, chromosomal DNA, cDNA and RNA.

10. The polynucleotide of claim 9, wherein the polynucleotide further comprises
heterologous nucleotides.

11. The polynucleotide of claim 10, wherein the polynucleotide encodes a fusion
polypeptide.

12. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having 0, 1 or 2 transmembrane domains comprises a nucleotide 
sequence chosen from one of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, 
SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 
13, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ 
ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 28, 
SEQ ID NO: 29, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 36, SEQ ID 
NO: 39, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, 
SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 53, SEQ ID 
NO: 55, SEQ ID NO: 57, SEQ ID NO: 58, SEQ ID NO: 60, SEQ ID NO: 61, 
SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID 
NO: 67, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 72, 
SEQ ID NO: 73, SEQ ID NO: 74, SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID 
NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 89, 
SEQ ID NO: 91, SEQ ID NO: 92, SEQ ID NO: 95, SEQ ID NO: 96, SEQ ID 
NO: 97, SEQ ID NO: 100, SEQ ID NO: 104, SEQ ID NO: 105, SEQ ID NO: 
106, SEQ ID NO: 109, SEQ ID NO: 110, SEQ ID NO: 111, SEQ ID NO: 113, 
SEQ ID NO: 116, SEQ ID NO: 121, SEQ ID NO: 122, SEQ ID NO: 123, SEQ 
ID NO: 125, SEQ ID NO: 126, SEQ ID NO: 127, SEQ ID NO: 128, SEQ ID 
NO: 131, SEQ ID NO: 132, SEQ ID NO: 134, SEQ ID NO: 136, SEQ ID NO: 
137, SEQ ID NO: 138, SEQ ID NO: 141, SEQ ID NO: 142, SEQ ID NO: 143, 
SEQ ID NO: 144, SEQ ID NO: 147, SEQ ID NO: 148, SEQ ID NO: 149, SEQ 
ID NO: 150, SEQ ID NO: 155, SEQ ID NO: 156, SEQ ID NO: 158, SEQ ID 
NO: 161, SEQ ID NO: 162, SEQ ID NO: 165, SEQ ID NO: 170, SEQ ID NO:
171, SEQ ID NO: 172, SEQ ID NO: 174, SEQ ID NO: 176, SEQ ID NO: 179, 
SEQ ID NO: 183, SEQ ID NO: 185, SEQ ID NO: 187, SEQ ID NO: 192, SEQ 
ID NO: 195, SEQ ID NO: 196, SEQ ID NO: 197, SEQ ID NO: 199, SEQ ID 
NO: 200, SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 204, SEQ ID NO: 
205, SEQ ID NO: 207, SEQ ID NO: 209 and SEQ ID NO: 210. 

13. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having 3 or more transmembrane domains comprises a 
nucleotide sequence chosen from one of SEQ ID NO: 2, SEQ ID NO: 5, SEQ 
ID NO: 6, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 15, 
SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID 
NO: 30, SEQ ID NO: 31, SEQ ID NO: 33, SEQ ID NO: 35, SEQ ID NO: 37, 
SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID 
NO: 46, SEQ ID NO: 48, SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, 
SEQ ID NO: 59, SEQ ID NO: 65, SEQ ID NO: 71, SEQ ID NO: 75, SEQ ID 
NO: 76, SEQ ID NO: 77, SEQ ID NO: 80, SEQ ID NO: 82, SEQ ID NO: 84, 
SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 90, SEQ ID NO: 93, SEQ ID 
NO: 94, SEQ ID NO: 98, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 102, 
SEQ ID NO: 103, SEQ ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 112, SEQ 
ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID 
NO: 119, SEQ ID NO: 120, SEQ ID NO: 124, SEQ ID NO: 129, SEQ ID NO: 
130, SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID NO: 139, SEQ ID NO: 140, 
SEQ ID NO: 145, SEQ ID NO: 146, SEQ ID NO: 151, SEQ ID NO: 152, SEQ 
ID NO: 153, SEQ ID NO: 154, SEQ ID NO: 157, SEQ ID NO: 159, SEQ ID 
NO: 160, SEQ ID NO: 163, SEQ ID NO: 164, SEQ ID NO: 166, SEQ ID NO: 
167, SEQ ID NO: 168, SEQ ID NO: 169, SEQ ID NO: 173, SEQ ID NO: 175, 
SEQ ID NO: 177, SEQ ID NO: 178, SEQ ID NO: 180, SEQ ID NO: 181, SEQ 
ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID 
NO: 189, SEQ ID NO: 190, SEQ ID NO: 191, SEQ ID NO: 193, SEQ ID NO: 
194, SEQ ID NO: 198, SEQ ID NO: 203, SEQ ID NO: 206, SEQ ID NO: 208, 
SEQ ID NO: 211, SEQ ID NO: 212, SEQ ID NO: 213, SEQ ID NO: 214 and 
SEQ ID NO: 215. 

14. The polynucleotide of claim 7, wherein the polynucleotide encoding a
polypeptide having an outer membrane domain or a periplasmic domain 
comprises a nucleotide sequence chosen from one of SEQ ID NO: 3, SEQ ID 
NO: 8, SEQ ID NO: 9, SEQ ID NO: 23, SEQ ID NO: 39, SEQ ID NO: 50, SEQ 
ID NO: 62, SEQ ID NO: 67, SEQ ID NO: 78, SEQ ID NO: 85, SEQ ID NO: 
125, SEQ ID NO: 134, SEQ ID NO: 147, SEQ ID NO: 165, SEQ ID NO: 172 
and SEQ ID NO: 179. 

15. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having an inner membrane domain comprises a nucleotide 
sequence chosen from one of SEQ ID NO: 2, SEQ ID NO: 5, SEQ ID NO: 6, 
SEQ ID NO: 7, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 
13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ 
ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 26, 
SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID 
NO: 31, SEQ ID NO: 32, SEQ ID NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, 
SEQ ID NO: 36, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID 
NO: 43, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, 
SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID 
NO: 56, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 65, 
SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID NO: 70, SEQ ID NO: 71, SEQ ID 
NO: 73, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 79, 
SEQ ID NO: 80, SEQ ID NO: 81, SEQ ID NO: 82, SEQ ID NO: 83, SEQ ID 
NO: 84, SEQ ID NO: 86, SEQ ID NO: 87, SEQ ID NO: 88, SEQ ID NO: 90,
SEQ ID NO: 91, SEQ ID NO: 93, SEQ ID NO: 94, SEQ ID NO: 95, SEQ ID 
NO: 96, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 99, SEQ ID NO: 100, 
SEQ ID NO: 101, SEQ ID NO: 102, SEQ ID NO: 103, SEQ ID NO: 105, SEQ 
ID NO: 106, SEQ ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 109, SEQ ID 
NO: 112, SEQ ID NO: 113, SEQ ID NO: 114, SEQ ID NO: 115, SEQ ID NO: 
117, SEQ ID NO: 118, SEQ ID NO: 119, SEQ ID NO: 120, SEQ ID NO: 121, 
SEQ ID NO: 122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 126, SEQ 
ID NO: 127, SEQ ID NO: 128, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID 
NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 135, SEQ ID NO:
136, SEQ ID NO: 139, SEQ ID NO: 140, SEQ ID NO: 141, SEQ ID NO: 142, 
SEQ ID NO: 144, SEQ ID NO: 145, SEQ ID NO: 146, SEQ ID NO: 148, SEQ 
ID NO: 150, SEQ ID NO: 151, SEQ ID NO: 152, SEQ ID NO: 153, SEQ ID 
NO: 154, SEQ ID NO: 156, SEQ ID NO: 157, SEQ ID NO: 158, SEQ ID NO: 
159, SEQ ID NO: 160, SEQ ID NO: 162, SEQ ID NO: 163, SEQ ID NO: 164, 
SEQ ID NO: 166, SEQ ID NO: 167, SEQ ID NO: 168, SEQ ID NO: 169, SEQ 
ID NO: 170, SEQ ID NO: 173, SEQ ID NO: 175, SEQ ID NO: 176, SEQ ID 
NO: 177, SEQ ID NO: 178, SEQ ID NO: 180, SEQ ID NO: 181, SEQ ID NO: 
182, SEQ ID NO: 184, SEQ ID NO: 186, SEQ ID NO: 187, SEQ ID NO: 188, 
SEQ ID NO: 189, SEQ ID NO: 190, SEQ ID NO: 191, SEQ ID NO: 192, SEQ 
ID NO: 193, SEQ ID NO: 194, SEQ ID NO: 195, SEQ ID NO: 198, SEQ ID 
NO: 200, SEQ ID NO: 203, SEQ ID NO: 206, SEQ ID NO: 208, SEQ ID NO: 
209, SEQ ID NO: 211, SEQ ID NO: 212, SEQ ID NO: 213, SEQ ID NO: 214 
and SEQ ID NO: 215. 

16. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide identified by Blastp analysis comprises a nucleotide sequence 
chosen from one of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 7, SEQ ID 
NO: 10, SEQ ID NO: 12, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 24, 
SEQ ID NO: 27, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID 
NO: 33, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 38, SEQ ID NO: 40, 
SEQ ID NO: 42, SEQ ID NO: 43, SEQ ID NO: 44, SEQ ID NO: 48, SEQ ID 
NO: 51, SEQ ID NO: 53, SEQ ID NO: 59, SEQ ID NO: 60, SEQ ID NO: 61, 
SEQ ID NO: 65, SEQ ID NO: 67, SEQ ID NO: 68, SEQ ID NO: 69, SEQ ID 
NO: 70, SEQ ID NO: 71, SEQ ID NO: 75, SEQ ID NO: 76, SEQ ID NO: 77, 
SEQ ID NO: 78, SEQ ID NO: 79, SEQ ID NO: 80, SEQ ID NO: 87, SEQ ID 
NO: 88, SEQ ID NO: 90, SEQ ID NO: 94, SEQ ID NO: 95, SEQ ID NO: 96, 
SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 103, SEQ ID NO: 105, SEQ 
ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 109, SEQ ID NO: 112, SEQ ID 
NO: 113, SEQ ID NO: 115, SEQ ID NO: 117, SEQ ID NO: 118, SEQ ID NO: 
122, SEQ ID NO: 123, SEQ ID NO: 124, SEQ ID NO: 127, SEQ ID NO: 129, 
SEQ ID NO: 131, SEQ ID NO: 132, SEQ ID NO: 133, SEQ ID NO: 134, SEQ 
ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 138, SEQ ID NO: 139, SEQ ID
NO: 141, SEQ ID NO: 144, SEQ ID NO: 146, SEQ ID NO: 147, SEQ ID NO: 
151, SEQ ID NO: 152, SEQ ID NO: 154, SEQ ID NO: 155, SEQ ID NO: 157,
SEQ ID NO: 158, SEQ ID NO: 159, SEQ ID NO: 160, SEQ ID NO: 161, SEQ 
ID NO: 162, SEQ ID NO: 163, SEQ ID NO: 165, SEQ ID NO: 166, SEQ ID 
NO: 167, SEQ ID NO: 169, SEQ ID NO: 172, SEQ ID NO: 173, SEQ ID NO: 
176, SEQ ID NO: 177, SEQ ID NO: 178, SEQ ID NO: 180, SEQ ID NO: 181, 
SEQ ID NO: 182, SEQ ID NO: 184, SEQ ID NO: 185, SEQ ID NO: 186, SEQ 
ID NO: 188, SEQ ID NO: 189, SEQ ID NO: 191, SEQ ID NO: 193, SEQ ID 
NO: 196, SEQ ID NO: 197, SEQ ID NO: 198, SEQ ID NO: 199, SEQ ID NO: 
200, SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO: 204, SEQ ID NO: 205, 
SEQ ID NO: 206, SEQ ID NO: 207, SEQ ID NO: 208, SEQ ID NO: 210, SEQ 
ID NO: 212, SEQ ID NO: 213 and SEQ ID NO: 214. 

17. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide identified by Pfam analysis comprises a nucleotide sequence 
chosen from one of SEQ ID NO: 4, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID 
NO: 41, SEQ ID NO: 45, SEQ ID NO: 55, SEQ ID NO: 57, SEQ ID NO: 58, 
SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 66, SEQ ID NO: 72, SEQ ID 
NO: 74, SEQ ID NO: 89, SEQ ID NO: 92, SEQ ID NO: 104, SEQ ID NO: 111, 
SEQ ID NO: 116, SEQ ID NO: 119, SEQ ID NO: 128, SEQ ID NO: 137, SEQ 
ID NO: 142, SEQ ID NO: 143, SEQ ID NO: 149, SEQ ID NO: 151, SEQ ID 
NO: 152, SEQ ID NO: 153, SEQ ID NO: 157, SEQ ID NO: 159, SEQ ID NO: 
160, SEQ ID NO: 162, SEQ ID NO: 163, SEQ ID NO: 164, SEQ ID NO: 165, 
SEQ ID NO: 166, SEQ ID NO: 169, SEQ ID NO: 171, SEQ ID NO: 174, SEQ
ID NO: 176, SEQ ID NO: 180, SEQ ID NO: 182, SEQ ID NO: 183, SEQ ID
NO: 184, SEQ ID NO: 186, SEQ ID NO: 188, SEQ ID NO: 189, SEQ ID NO:
195, SEQ ID NO: 198, SEQ ID NO: 199, SEQ ID NO: 205, SEQ ID NO: 212
and SEQ ID NO: 213. 

18. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
lipoprotein comprises a nucleotide sequence chosen from one of SEQ ID NO: 
3, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 13, SEQ ID NO: 21, SEQ ID
NO: 26, SEQ ID NO: 34, SEQ ID NO: 62, SEQ ID NO: 67, SEQ ID NO: 85,
SEQ ID NO: 134, SEQ ID NO: 147, SEQ ID NO: 150, SEQ ID NO: 168, SEQ 
ID NO: 170 and SEQ ID NO: 173. 

19. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having a LPXTG motif and covalently attached to the 
peptidoglycan layer comprises a nucleotide sequence chosen from one of 
SEQ ID NO: 13, SEQ ID NO: 21, SEQ ID NO: 34 and SEQ ID NO: 170. 

20. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having a peptidoglycan binding motif and associated with the 
peptidoglycan layer comprises a nucleotide sequence chosen from one of 
SEQ ID NO: 25, SEQ ID NO: 49 and SEQ ID NO: 110. 

21. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having a signal sequence and a C-terminal Tyrosine or a
C-terminal Phenylalanine amino acid comprises a nucleotide sequence chosen
from one of SEQ ID NO:11, SEQ ID NO:39, SEQ ID NO:73, SEQ ID NO:97, 
SEQ ID NO:106, SEQ ID NO: 125 and SEQ ID NO:187. 

22. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide having a tripeptide RGD amino acid sequence comprises a 
nucleotide sequence chosen from one of SEQ ID NO:1, SEQ ID NO:21, SEQ 
ID NO:66 and SEQ ID NO:67. 

23. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide identified by proteomics as surface exposed comprises a 
nucleotide sequence chosen from one of SEQ ID NO:14, SEQ ID NO:16, 
SEQ ID NO:17, SEQ ID NO:46, SEQ ID NO:64, SEQ ID NO:66, SEQ ID 
NO:67, SEQ ID NO:69, SEQ ID NO:71, SEQ ID NO:74, SEQ ID NO:91, SEQ 
ID NO:103, SEQ ID NO:116, SEQ ID NO:128, SEQ ID NO:131, SEQ ID 
NO:136, SEQ ID NO:151, SEQ ID NO:156, SEQ ID NO:159, SEQ ID NO:162,
SEQ ID NO:164, SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:178, SEQ ID
NO:179, SEQ ID NO:180, SEQ ID NO:182 and SEQ ID NO:205. 

24. The polynucleotide of claim 7, wherein the polynucleotide encoding a 
polypeptide identified by proteomics as membrane associated comprises a 
nucleotide sequence chosen from one of SEQ ID NO:431 through SEQ ID
NO: 591. 

25. An isolated polypeptide encoded by a polynucleotide of a Streptococcus 
pneumoniae genomic sequence, wherein the polynucleotide comprises a 
nucleotide sequence having at least about 95% identity to a nucleotide 
sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 215 or 
SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant thereof, or a 
fragment thereof. 

26. The polypeptide of claim 25, wherein the polypeptide is a fusion polypeptide. 

27. The polypeptide of claim 25, which immunoreacts with seropositive serum of 
an individual infected with Streptococcus pneumoniae. 

28. The polypeptide of claim 25, further defined as: 

(a) a Streptococcus pneumoniae polypeptide having 0, 1 or 2 
transmembrane domains; 

(b) a Streptococcus pneumoniae polypeptide having 3 or more 
transmembrane domains; 

(c) a Streptococcus pneumoniae polypeptide having an outer membrane 
domain or a periplasmic domain; 

(d) a Streptococcus pneumoniae polypeptide having an inner membrane 
domain; 

(e) a Streptococcus pneumoniae polypeptide identified by Blastp analysis; 

(f) a Streptococcus pneumoniae polypeptide identified by Pfam analysis; 

(g) a Streptococcus pneumoniae lipoprotein;

(h) a Streptococcus pneumoniae polypeptide having a LPXTG motif,
wherein the polypeptide is covalently attached to the peptidoglycan 
layer; 

(i) a Streptococcus pneumoniae polypeptide having a peptidoglycan 
binding motif, wherein the polypeptide is associated with the 
peptidoglycan layer; 

(j) a Streptococcus pneumoniae polypeptide having a signal sequence
and a C-terminal Tyrosine or a C-terminal Phenylalanine amino acid; 

(k) a Streptococcus pneumoniae polypeptide having a tripeptide RGD 
amino acid sequence; 

(l) a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed; 

and 

(m) a Streptococcus pneumoniae polypeptide identified by proteomics as 
membrane associated. 

29. The polypeptide of claim 28, wherein the polypeptide having 0, 1 or 2 
transmembrane domains comprises an amino acid sequence chosen from 
one of SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 219, SEQ ID NO: 
222, SEQ ID NO: 223, SEQ ID NO: 224, SEQ ID NO: 226, SEQ ID NO: 228, 
SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 233, SEQ ID NO: 234, SEQ 
ID NO: 237, SEQ ID NO: 238, SEQ ID NO: 239, SEQ ID NO: 240, SEQ ID 
NO: 243, SEQ ID NO: 244, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO: 
251, SEQ ID NO: 254, SEQ ID NO: 256, SEQ ID NO: 257, SEQ ID NO: 260, 
SEQ ID NO: 262, SEQ ID NO: 264, SEQ ID NO: 265, SEQ ID NO: 266, SEQ 
ID NO: 268, SEQ ID NO: 270, SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID 
NO: 275, SEQ ID NO: 276, SEQ ID NO: 277, SEQ ID NO: 278, SEQ ID NO: 
279, SEQ ID NO: 281, SEQ ID NO: 282, SEQ ID NO: 283, SEQ ID NO: 284, 
SEQ ID NO: 285, SEQ ID NO: 286, SEQ ID NO: 287, SEQ ID NO: 289, SEQ 
ID NO: 293, SEQ ID NO: 294, SEQ ID NO: 296, SEQ ID NO: 298, SEQ ID 
NO: 300, SEQ ID NO: 301, SEQ ID NO: 304, SEQ ID NO: 306, SEQ ID NO: 
307, SEQ ID NO: 310, SEQ ID NO: 311, SEQ ID NO: 312, SEQ ID NO: 315, 
SEQ ID NO: 319, SEQ ID NO: 320, SEQ ID NO: 321, SEQ ID NO: 324, SEQ
ID NO: 325, SEQ ID NO: 326, SEQ ID NO: 328, SEQ ID NO: 331, SEQ ID
NO: 336, SEQ ID NO: 337, SEQ ID NO: 338, SEQ ID NO: 340, SEQ ID NO: 
341, SEQ ID NO: 342, SEQ ID NO: 343, SEQ ID NO: 346, SEQ ID NO: 347, 
SEQ ID NO: 349, SEQ ID NO: 351, SEQ ID NO: 352, SEQ ID NO: 353, SEQ 
ID NO: 356, SEQ ID NO: 357, SEQ ID NO: 358, SEQ ID NO: 359, SEQ ID 
NO: 362, SEQ ID NO: 363, SEQ ID NO: 364, SEQ ID NO: 365, SEQ ID NO: 
370, SEQ ID NO: 371, SEQ ID NO: 373, SEQ ID NO: 376, SEQ ID NO: 377, 
SEQ ID NO: 380, SEQ ID NO: 385, SEQ ID NO: 386, SEQ ID NO: 387, SEQ 
ID NO: 389, SEQ ID NO: 391, SEQ ID NO: 394, SEQ ID NO: 398, SEQ ID 
NO: 400, SEQ ID NO: 402, SEQ ID NO: 407, SEQ ID NO: 410, SEQ ID NO: 
411, SEQ ID NO: 412, SEQ ID NO: 414, SEQ ID NO: 415, SEQ ID NO: 416, 
SEQ ID NO: 417, SEQ ID NO: 419, SEQ ID NO: 420, SEQ ID NO: 422, SEQ 
ID NO: 424, SEQ ID NO: 425, a fragment thereof or a degenerate variant 
thereof. 

30. The polypeptide of claim 28, wherein the polypeptide having 3 or more 
transmembrane domains comprises an amino acid sequence chosen from 
one of SEQ ID NO: 217, SEQ ID NO: 220, SEQ ID NO: 221, SEQ ID NO: 
225, SEQ ID NO: 227, SEQ ID NO: 229, SEQ ID NO: 230, SEQ ID NO: 235, 
SEQ ID NO: 236, SEQ ID NO: 241, SEQ ID NO: 242, SEQ ID NO: 245, SEQ 
ID NO: 246, SEQ ID NO: 248, SEQ ID NO: 250, SEQ ID NO: 252, SEQ ID 
NO: 253, SEQ ID NO: 255, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID NO: 
261, SEQ ID NO: 263, SEQ ID NO: 267, SEQ ID NO: 269, SEQ ID NO: 271, 
SEQ ID NO: 274, SEQ ID NO: 280, SEQ ID NO: 286, SEQ ID NO: 290, SEQ 
ID NO: 291, SEQ ID NO: 292, SEQ ID NO: 295, SEQ ID NO: 297, SEQ ID 
NO: 299, SEQ ID NO: 302, SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 
308, SEQ ID NO: 309, SEQ ID NO: 313, SEQ ID NO: 314, SEQ ID NO: 316, 
SEQ ID NO: 317, SEQ ID NO: 318, SEQ ID NO: 322, SEQ ID NO: 323, SEQ 
ID NO: 327, SEQ ID NO: 329, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID 
NO: 333, SEQ ID NO: 334, SEQ ID NO: 335, SEQ ID NO: 339, SEQ ID NO: 
344, SEQ ID NO: 345, SEQ ID NO: 348, SEQ ID NO: 350, SEQ ID NO: 354, 
SEQ ID NO: 355, SEQ ID NO: 360, SEQ ID NO: 361, SEQ ID NO: 366, SEQ 
ID NO: 367, SEQ ID NO: 368, SEQ ID NO: 369, SEQ ID NO: 372, SEQ ID
NO: 374, SEQ ID NO: 375, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO:
381, SEQ ID NO: 382, SEQ ID NO: 383, SEQ ID NO: 384, SEQ ID NO: 388, 
SEQ ID NO: 390, SEQ ID NO: 392, SEQ ID NO: 393, SEQ ID NO: 395, SEQ 
ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, SEQ ID NO: 401, SEQ ID 
NO: 403, SEQ ID NO: 404, SEQ ID NO: 405, SEQ ID NO: 406, SEQ ID NO: 
408, SEQ ID NO: 409, SEQ ID NO: 413, SEQ ID NO: 418, SEQ ID NO: 421, 
SEQ ID NO: 423, SEQ ID NO: 426, SEQ ID NO: 427, SEQ ID NO: 428, SEQ 
ID NO: 429, SEQ ID NO: 430, a fragment thereof or a degenerate variant 
thereof. 

31. The polypeptide of claim 28, wherein the polypeptide having an outer 
membrane or a periplasmic domain comprises an amino acid sequence 
chosen from one of SEQ ID NO: 218, SEQ ID NO: 223, SEQ ID NO: 224, 
SEQ ID NO: 238, SEQ ID NO: 254, SEQ ID NO: 265, SEQ ID NO: 277, SEQ 
ID NO: 282, SEQ ID NO: 293, SEQ ID NO: 300, SEQ ID NO: 340, SEQ ID 
NO: 349, SEQ ID NO: 362, SEQ ID NO: 380, SEQ ID NO: 387, SEQ ID NO: 
394, a fragment thereof or a degenerate variant thereof. 

32. The polypeptide of claim 28, wherein the polypeptide having an inner 
membrane domain comprises an amino acid sequence chosen from one of 
SEQ ID NO: 217, SEQ ID NO: 220, SEQ ID NO: 221, SEQ ID NO: 222, SEQ 
ID NO: 225, SEQ ID NO: 226, SEQ ID NO: 227, SEQ ID NO: 228, SEQ ID 
NO: 229, SEQ ID NO: 230, SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 
234, SEQ ID NO: 235, SEQ ID NO: 236, SEQ ID NO: 237, SEQ ID NO: 241, 
SEQ ID NO: 242, SEQ ID NO: 243, SEQ ID NO: 244, SEQ ID NO: 245, SEQ 
ID NO: 246, SEQ ID NO: 247, SEQ ID NO: 248, SEQ ID NO: 249, SEQ ID 
NO: 250, SEQ ID NO: 251, SEQ ID NO: 252, SEQ ID NO: 253, SEQ ID NO: 
255, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID NO: 261, SEQ ID NO: 262, 
SEQ ID NO: 263, SEQ ID NO: 266, SEQ ID NO: 267, SEQ ID NO: 268, SEQ 
ID NO: 269, SEQ ID NO: 271, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID 
NO: 276, SEQ ID NO: 280, SEQ ID NO: 283, SEQ ID NO: 284, SEQ ID NO: 
285, SEQ ID NO: 286, SEQ ID NO: 288, SEQ ID NO: 290, SEQ ID NO: 291, 
SEQ ID NO: 292, SEQ ID NO: 294, SEQ ID NO: 295, SEQ ID NO: 296, SEQ
ID NO: 297, SEQ ID NO: 298, SEQ ID NO: 299, SEQ ID NO: 301, SEQ ID
NO: 302, SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 306, SEQ ID NO: 
308, SEQ ID NO: 309, SEQ ID NO: 310, SEQ ID NO: 311, SEQ ID NO: 312, 
SEQ ID NO: 313, SEQ ID NO: 314, SEQ ID NO: 315, SEQ ID NO: 316, SEQ 
ID NO: 317, SEQ ID NO: 318, SEQ ID NO: 320, SEQ ID NO: 321, SEQ ID 
NO: 322, SEQ ID NO: 323, SEQ ID NO: 324, SEQ ID NO: 327, SEQ ID NO: 
328, SEQ ID NO: 329, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID NO: 333, 
SEQ ID NO: 334, SEQ ID NO: 335, SEQ ID NO: 336, SEQ ID NO: 337, SEQ 
ID NO: 338, SEQ ID NO: 339, SEQ ID NO: 341, SEQ ID NO: 342, SEQ ID 
NO: 343, SEQ ID NO: 344, SEQ ID NO: 345, SEQ ID NO: 346, SEQ ID NO: 
347, SEQ ID NO: 348, SEQ ID NO: 350, SEQ ID NO: 351, SEQ ID NO: 354, 
SEQ ID NO: 355, SEQ ID NO: 356, SEQ ID NO: 357, SEQ ID NO: 359, SEQ 
ID NO: 360, SEQ ID NO: 361, SEQ ID NO: 362, SEQ ID NO: 365, SEQ ID 
NO: 366, SEQ ID NO: 367, SEQ ID NO: 368, SEQ ID NO: 369, SEQ ID NO: 
371, SEQ ID NO: 372, SEQ ID NO: 373, SEQ ID NO: 374, SEQ ID NO: 375, 
SEQ ID NO: 377, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO: 381, SEQ 
ID NO: 382, SEQ ID NO: 383, SEQ ID NO: 384, SEQ ID NO: 385, SEQ ID 
NO: 388, SEQ ID NO: 390, SEQ ID NO: 391, SEQ ID NO: 392, SEQ ID NO: 
393, SEQ ID NO: 395, SEQ ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, 
SEQ ID NO: 401, SEQ ID NO: 402, SEQ ID NO: 403, SEQ ID NO: 404, SEQ 
ID NO: 405, SEQ ID NO: 406, SEQ ID NO: 407, SEQ ID NO: 408, SEQ ID 
NO: 409, SEQ ID NO: 410, SEQ ID NO: 413, SEQ ID NO: 415, SEQ ID NO: 
418, SEQ ID NO: 421, SEQ ID NO: 423, SEQ ID NO: 424, SEQ ID NO: 426, 
SEQ ID NO: 427, SEQ ID NO: 428, SEQ ID NO: 429, SEQ ID NO: 430, a 
fragment thereof or a degenerate variant thereof. 

33. The polypeptide of claim 28, wherein the polypeptide identified by Blastp 
analysis comprises an amino acid sequence chosen from one of SEQ ID NO: 
216, SEQ ID NO: 217, SEQ ID NO: 222, SEQ ID NO: 225, SEQ ID NO: 227, 
SEQ ID NO: 231, SEQ ID NO: 235, SEQ ID NO: 239, SEQ ID NO: 242, SEQ 
ID NO: 245, SEQ ID NO: 246, SEQ ID NO: 247, SEQ ID NO: 248, SEQ ID 
NO: 249, SEQ ID NO: 250, SEQ ID NO: 253, SEQ ID NO: 255, SEQ ID NO: 
257, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID NO: 263, SEQ ID NO: 266,
SEQ ID NO: 268, SEQ ID NO: 269, SEQ ID NO: 275, SEQ ID NO: 276, SEQ
ID NO: 280, SEQ ID NO: 282, SEQ ID NO: 283, SEQ ID NO: 284, SEQ ID 
NO: 285, SEQ ID NO: 286, SEQ ID NO: 290, SEQ ID NO: 291, SEQ ID NO: 
292, SEQ ID NO: 293, SEQ ID NO: 294, SEQ ID NO: 295, SEQ ID NO: 302, 
SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 309, SEQ ID NO: 310, SEQ 
ID NO: 311, SEQ ID NO: 313, SEQ ID NO: 315, SEQ ID NO: 318, SEQ ID 
NO: 320, SEQ ID NO: 322, SEQ ID NO: 323, SEQ ID NO: 324, SEQ ID NO: 
327, SEQ ID NO: 328, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID NO: 333, 
SEQ ID NO: 337, SEQ ID NO: 338, SEQ ID NO: 339, SEQ ID NO: 342, SEQ 
ID NO: 344, SEQ ID NO: 346, SEQ ID NO: 347, SEQ ID NO: 348, SEQ ID 
NO: 349, SEQ ID NO: 350, SEQ ID NO: 351, SEQ ID NO: 353, SEQ ID NO: 
354, SEQ ID NO: 356, SEQ ID NO: 359, SEQ ID NO: 361, SEQ ID NO: 362, 
SEQ ID NO: 366, SEQ ID NO: 367, SEQ ID NO: 369, SEQ ID NO: 370, SEQ
ID NO: 372, SEQ ID NO: 373, SEQ ID NO: 374, SEQ ID NO: 375, SEQ ID 
NO: 376, SEQ ID NO: 377, SEQ ID NO: 378, SEQ ID NO: 380, SEQ ID NO: 
381, SEQ ID NO: 382, SEQ ID NO: 384, SEQ ID NO: 387, SEQ ID NO: 388, 
SEQ ID NO: 391, SEQ ID NO: 392, SEQ ID NO: 393, SEQ ID NO: 395, SEQ 
ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, SEQ ID NO: 400, SEQ ID 
NO: 401, SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 406, SEQ ID NO: 
408, SEQ ID NO: 411, SEQ ID NO: 412, SEQ ID NO: 413, SEQ ID NO: 414,
SEQ ID NO: 415, SEQ ID NO: 416, SEQ ID NO: 417, SEQ ID NO: 419, SEQ 
ID NO: 420, SEQ ID NO: 421, SEQ ID NO: 422, SEQ ID NO: 423, SEQ ID 
NO: 425, SEQ ID NO: 427, SEQ ID NO: 428, SEQ ID NO: 429, a fragment 
thereof or a degenerate variant thereof. 

34. The polypeptide of claim 28, wherein the polypeptide identified by Pfam 
analysis comprises an amino acid sequence chosen from one of SEQ ID NO: 
219, SEQ ID NO: 233, SEQ ID NO: 234, SEQ ID NO: 255, SEQ ID NO: 260, 
SEQ ID NO: 270, SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 278, SEQ 
ID NO: 279, SEQ ID NO: 281, SEQ ID NO: 287, SEQ ID NO: 289, SEQ ID 
NO: 304, SEQ ID NO: 307, SEQ ID NO: 319, SEQ ID NO: 326, SEQ ID NO: 
331, SEQ ID NO: 334, SEQ ID NO: 343, SEQ ID NO: 352, SEQ ID NO: 357, 
SEQ ID NO: 358, SEQ ID NO: 364, SEQ ID NO: 366, SEQ ID NO: 367, SEQ
ID NO: 368, SEQ ID NO: 372, SEQ ID NO: 374, SEQ ID NO: 375, SEQ ID
NO: 377, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO: 380, SEQ ID NO: 
381, SEQ ID NO: 384, SEQ ID NO: 386, SEQ ID NO: 389, SEQ ID NO: 391, 
SEQ ID NO: 395, SEQ ID NO: 397, SEQ ID NO: 398, SEQ ID NO: 399, SEQ 
ID NO: 401, SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 410, SEQ ID
NO: 413, SEQ ID NO: 414, SEQ ID NO: 420, SEQ ID NO: 427, SEQ ID NO:
428, a fragment thereof or a degenerate variant thereof. 

35. The polypeptide of claim 28, wherein the polypeptide is a lipoprotein and 
comprises an amino acid sequence chosen from one of SEQ ID NO: 218, 
SEQ ID NO: 223, SEQ ID NO: 224, SEQ ID NO: 228, SEQ ID NO: 236, SEQ 
ID NO: 241, SEQ ID NO: 249, SEQ ID NO: 277, SEQ ID NO: 282, SEQ ID 
NO: 300, SEQ ID NO: 349, SEQ ID NO: 362, SEQ ID NO: 365, SEQ ID NO: 
383, SEQ ID NO: 385, SEQ ID NO: 388, a fragment thereof or a degenerate 
variant thereof. 

36. The polypeptide of claim 28, wherein the polypeptide having a LPXTG motif 
and covalently attached to the peptidoglycan layer comprises an amino acid 
sequence chosen from one of SEQ ID NO: 228, SEQ ID NO: 236, SEQ ID 
NO: 249, SEQ ID NO: 385, a fragment thereof or a degenerate variant
thereof. 

37. The polypeptide of claim 28, wherein the polypeptide having a peptidoglycan 
binding motif and associated with the peptidoglycan layer comprises an 
amino acid sequence selected from one of SEQ ID NO: 240, SEQ ID NO: 
264, SEQ ID NO: 325, a fragment thereof or a degenerate variant thereof. 

38. The polypeptide of claim 28, wherein the polypeptide having a signal 
sequence and a C-terminal Tyrosine or Phenylalanine amino acid comprises 
an amino acid sequence chosen from one of SEQ ID NO:226, SEQ ID 
NO:254, SEQ ID NO:289, SEQ ID NO:312, SEQ ID NO:321, SEQ ID NO: 
340, SEQ ID NO:402, a fragment thereof or a degenerate variant thereof.

39. The polypeptide of claim 28, wherein the polypeptide having a tripeptide RGD
sequence that potentially is involved in cell attachment comprises an amino 
acid sequence chosen from one of SEQ ID NO:216, SEQ ID NO:236, SEQ ID 
NO:281, SEQ ID NO:282, a fragment thereof or a degenerate variant thereof.

40. The polypeptide of claim 28, wherein the polypeptide identified by proteomics 
as surface exposed comprises an amino acid sequence chosen from one of 
SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 261, SEQ 
ID NO: 279, SEQ ID NO: 281, SEQ ID NO: 282, SEQ ID NO: 284, SEQ ID 
NO: 286, SEQ ID NO: 289, SEQ ID NO: 306, SEQ ID NO: 318, SEQ ID NO: 
331, SEQ ID NO: 343, SEQ ID NO: 346, SEQ ID NO: 351, SEQ ID NO: 366, 
SEQ ID NO: 371, SEQ ID NO: 374, SEQ ID NO: 377, SEQ ID NO: 379, SEQ 
ID NO: 387, SEQ ID NO: 391, SEQ ID NO: 393, SEQ ID NO: 394, SEQ ID NO: 395, SEQ
ID NO: 397, SEQ ID NO: 420, a fragment thereof or a degenerate variant 
thereof. 

41. The polypeptide of claim 28, wherein the polypeptide identified by proteomics
as surface exposed comprises an amino acid sequence chosen from one of 
SEQ ID NO:592 through SEQ ID NO: 752, a fragment thereof or a 
degenerate variant thereof. 

42. An isolated polypeptide comprising an amino acid sequence having at least 
about 95% identity to an amino acid sequence chosen from one of SEQ ID 
NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ ID 
NO:752. 

43. The polypeptide of claim 42, wherein the polypeptide is a fusion polypeptide.

44. The polypeptide of claim 42, which immunoreacts with seropositive serum of 
an individual infected with Streptococcus pneumoniae. 

45. The polypeptide of claim 42, further defined as:

(a) a Streptococcus pneumoniae polypeptide having 0, 1 or 2
transmembrane domains; 

(b) a Streptococcus pneumoniae polypeptide having 3 or more 
transmembrane domains; 

(c) a Streptococcus pneumoniae polypeptide having an outer membrane 
domain or a periplasmic domain; 

(d) a Streptococcus pneumoniae polypeptide having an inner membrane 
domain; 

(e) a Streptococcus pneumoniae polypeptide identified by Blastp analysis; 

(f) a Streptococcus pneumoniae polypeptide identified by Pfam analysis; 

(g) a Streptococcus pneumoniae lipoprotein; 

(h) a Streptococcus pneumoniae polypeptide having a LPXTG motif, 
wherein the polypeptide is covalently attached to the peptidoglycan 
layer; 

(i) a Streptococcus pneumoniae polypeptide having a peptidoglycan 
binding motif, wherein the polypeptide is associated with the 
peptidoglycan layer; 

(j) a Streptococcus pneumoniae polypeptide having a signal sequence 
and a C-terminal Tyrosine or a C-terminal Phenylalanine amino acid; 

(k) a Streptococcus pneumoniae polypeptide having a tripeptide RGD 
amino acid sequence; 

(l) a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed; 

and 

(m) a Streptococcus pneumoniae polypeptide identified by proteomics as 
membrane associated. 

46. The polypeptide of claim 45, wherein the polypeptide having 0, 1 or 2
transmembrane domains comprises an amino acid sequence chosen from 
one of SEQ ID NO: 216, SEQ ID NO: 218, SEQ ID NO: 219, SEQ ID NO: 
222, SEQ ID NO: 223, SEQ ID NO: 224, SEQ ID NO: 226, SEQ ID NO: 228, 
SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 233, SEQ ID NO: 234, SEQ 
ID NO: 237, SEQ ID NO: 238, SEQ ID NO: 239, SEQ ID NO: 240, SEQ ID 
NO: 243, SEQ ID NO: 244, SEQ ID NO: 247, SEQ ID NO: 249, SEQ ID NO:
251, SEQ ID NO: 254, SEQ ID NO: 256, SEQ ID NO: 257, SEQ ID NO: 260,
SEQ ID NO: 262, SEQ ID NO: 264, SEQ ID NO: 265, SEQ ID NO: 266, SEQ 
ID NO: 268, SEQ ID NO: 270, SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID 
NO: 275, SEQ ID NO: 276, SEQ ID NO: 277, SEQ ID NO: 278, SEQ ID NO: 
279, SEQ ID NO: 281, SEQ ID NO: 282, SEQ ID NO: 283, SEQ ID NO: 284, 
SEQ ID NO: 285, SEQ ID NO: 286, SEQ ID NO: 287, SEQ ID NO: 289, SEQ 
ID NO: 293, SEQ ID NO: 294, SEQ ID NO: 296, SEQ ID NO: 298, SEQ ID 
NO: 300, SEQ ID NO: 301, SEQ ID NO: 304, SEQ ID NO: 306, SEQ ID NO: 
307, SEQ ID NO: 310, SEQ ID NO: 311, SEQ ID NO: 312, SEQ ID NO: 315, 
SEQ ID NO: 319, SEQ ID NO: 320, SEQ ID NO: 321, SEQ ID NO: 324, SEQ 
ID NO: 325, SEQ ID NO: 326, SEQ ID NO: 328, SEQ ID NO: 331, SEQ ID
NO: 336, SEQ ID NO: 337, SEQ ID NO: 338, SEQ ID NO: 340, SEQ ID NO: 
341, SEQ ID NO: 342, SEQ ID NO: 343, SEQ ID NO: 346, SEQ ID NO: 347, 
SEQ ID NO: 349, SEQ ID NO: 351, SEQ ID NO: 352, SEQ ID NO: 353, SEQ 
ID NO: 356, SEQ ID NO: 357, SEQ ID NO: 358, SEQ ID NO: 359, SEQ ID 
NO: 362, SEQ ID NO: 363, SEQ ID NO: 364, SEQ ID NO: 365, SEQ ID NO: 
370, SEQ ID NO: 371, SEQ ID NO: 373, SEQ ID NO: 376, SEQ ID NO: 377, 
SEQ ID NO: 380, SEQ ID NO: 385, SEQ ID NO: 386, SEQ ID NO: 387, SEQ 
ID NO: 389, SEQ ID NO: 391, SEQ ID NO: 394, SEQ ID NO: 398, SEQ ID 
NO: 400, SEQ ID NO: 402, SEQ ID NO: 407, SEQ ID NO: 410, SEQ ID NO: 
411, SEQ ID NO: 412, SEQ ID NO: 414, SEQ ID NO: 415, SEQ ID NO: 416, 
SEQ ID NO: 417, SEQ ID NO: 419, SEQ ID NO: 420, SEQ ID NO: 422, SEQ 
ID NO: 424, SEQ ID NO: 425, a biological equivalent thereof, or a fragment
thereof. 

47. The polypeptide of claim 45, wherein the polypeptide having 3 or more 
transmembrane domains comprises an amino acid sequence chosen from 
one of SEQ ID NO: 217, SEQ ID NO: 220, SEQ ID NO: 221, SEQ ID NO: 
225, SEQ ID NO: 227, SEQ ID NO: 229, SEQ ID NO: 230, SEQ ID NO: 235, 
SEQ ID NO: 236, SEQ ID NO: 241, SEQ ID NO: 242, SEQ ID NO: 245, SEQ 
ID NO: 246, SEQ ID NO: 248, SEQ ID NO: 250, SEQ ID NO: 252, SEQ ID 
NO: 253, SEQ ID NO: 255, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID NO: 
261, SEQ ID NO: 263, SEQ ID NO: 267, SEQ ID NO: 269, SEQ ID NO: 271,
SEQ ID NO: 274, SEQ ID NO: 280, SEQ ID NO: 286, SEQ ID NO: 290, SEQ
ID NO: 291, SEQ ID NO: 292, SEQ ID NO: 295, SEQ ID NO: 297, SEQ ID 
NO: 299, SEQ ID NO: 302, SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 
308, SEQ ID NO: 309, SEQ ID NO: 313, SEQ ID NO: 314, SEQ ID NO: 316, 
SEQ ID NO: 317, SEQ ID NO: 318, SEQ ID NO: 322, SEQ ID NO: 323, SEQ 
ID NO: 327, SEQ ID NO: 329, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID 
NO: 333, SEQ ID NO: 334, SEQ ID NO: 335, SEQ ID NO: 339, SEQ ID NO: 
344, SEQ ID NO: 345, SEQ ID NO: 348, SEQ ID NO: 350, SEQ ID NO: 354, 
SEQ ID NO: 355, SEQ ID NO: 360, SEQ ID NO: 361, SEQ ID NO: 366, SEQ 
ID NO: 367, SEQ ID NO: 368, SEQ ID NO: 369, SEQ ID NO: 372, SEQ ID 
NO: 374, SEQ ID NO: 375, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO: 
381, SEQ ID NO: 382, SEQ ID NO: 383, SEQ ID NO: 384, SEQ ID NO: 388, 
SEQ ID NO: 390, SEQ ID NO: 392, SEQ ID NO: 393, SEQ ID NO: 395, SEQ 
ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, SEQ ID NO: 401, SEQ ID
NO: 403, SEQ ID NO: 404, SEQ ID NO: 405, SEQ ID NO: 406, SEQ ID NO: 
408, SEQ ID NO: 409, SEQ ID NO: 413, SEQ ID NO: 418, SEQ ID NO: 421, 
SEQ ID NO: 423, SEQ ID NO: 426, SEQ ID NO: 427, SEQ ID NO: 428, SEQ 
ID NO: 429, SEQ ID NO: 430, a biological equivalent thereof, or a fragment 
thereof. 

48. The polypeptide of claim 45, wherein the polypeptide having an outer 
membrane domain or a periplasmic domain comprises an amino acid 
sequence chosen from one of SEQ ID NO: 218, SEQ ID NO: 223, SEQ ID 
NO: 224, SEQ ID NO: 238, SEQ ID NO: 254, SEQ ID NO: 265, SEQ ID NO: 
277, SEQ ID NO: 282, SEQ ID NO: 293, SEQ ID NO: 300, SEQ ID NO: 340, 
SEQ ID NO: 349, SEQ ID NO: 362, SEQ ID NO: 380, SEQ ID NO: 387, SEQ 
ID NO: 394, a biological equivalent thereof, or a fragment thereof. 

49. The polypeptide of claim 45, wherein the polypeptide having an inner 
membrane domain comprises an amino acid sequence chosen from one of 
SEQ ID NO: 217, SEQ ID NO: 220, SEQ ID NO: 221, SEQ ID NO: 222, SEQ 
ID NO: 225, SEQ ID NO: 226, SEQ ID NO: 227, SEQ ID NO: 228, SEQ ID 
NO: 229, SEQ ID NO: 230, SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO:
234, SEQ ID NO: 235, SEQ ID NO: 236, SEQ ID NO: 237, SEQ ID NO: 241,
SEQ ID NO: 242, SEQ ID NO: 243, SEQ ID NO: 244, SEQ ID NO: 245, SEQ 
ID NO: 246, SEQ ID NO: 247, SEQ ID NO: 248, SEQ ID NO: 249, SEQ ID 
NO: 250, SEQ ID NO: 251, SEQ ID NO: 252, SEQ ID NO: 253, SEQ ID NO: 
255, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID NO: 261, SEQ ID NO: 262, 
SEQ ID NO: 263, SEQ ID NO: 266, SEQ ID NO: 267, SEQ ID NO: 268, SEQ 
ID NO: 269, SEQ ID NO: 271, SEQ ID NO: 274, SEQ ID NO: 275, SEQ ID 
NO: 276, SEQ ID NO: 280, SEQ ID NO: 283, SEQ ID NO: 284, SEQ ID NO: 
285, SEQ ID NO: 286, SEQ ID NO: 288, SEQ ID NO: 290, SEQ ID NO: 291, 
SEQ ID NO: 292, SEQ ID NO: 294, SEQ ID NO: 295, SEQ ID NO: 296, SEQ 
ID NO: 297, SEQ ID NO: 298, SEQ ID NO: 299, SEQ ID NO: 301, SEQ ID
NO: 302, SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 306, SEQ ID NO: 
308, SEQ ID NO: 309, SEQ ID NO: 310, SEQ ID NO: 311, SEQ ID NO: 312, 
SEQ ID NO: 313, SEQ ID NO: 314, SEQ ID NO: 315, SEQ ID NO: 316, SEQ 
ID NO: 317, SEQ ID NO: 318, SEQ ID NO: 320, SEQ ID NO: 321, SEQ ID 
NO: 322, SEQ ID NO: 323, SEQ ID NO: 324, SEQ ID NO: 327, SEQ ID NO: 
328, SEQ ID NO: 329, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID NO: 333, 
SEQ ID NO: 334, SEQ ID NO: 335, SEQ ID NO: 336, SEQ ID NO: 337, SEQ 
ID NO: 338, SEQ ID NO: 339, SEQ ID NO: 341, SEQ ID NO: 342, SEQ ID 
NO: 343, SEQ ID NO: 344, SEQ ID NO: 345, SEQ ID NO: 346, SEQ ID NO: 
347, SEQ ID NO: 348, SEQ ID NO: 350, SEQ ID NO: 351, SEQ ID NO: 354, 
SEQ ID NO: 355, SEQ ID NO: 356, SEQ ID NO: 357, SEQ ID NO: 359, SEQ 
ID NO: 360, SEQ ID NO: 361, SEQ ID NO: 362, SEQ ID NO: 365, SEQ ID 
NO: 366, SEQ ID NO: 367, SEQ ID NO: 368, SEQ ID NO: 369, SEQ ID NO: 
371, SEQ ID NO: 372, SEQ ID NO: 373, SEQ ID NO: 374, SEQ ID NO: 375, 
SEQ ID NO: 377, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO: 381, SEQ 
ID NO: 382, SEQ ID NO: 383, SEQ ID NO: 384, SEQ ID NO: 385, SEQ ID 
NO: 388, SEQ ID NO: 390, SEQ ID NO: 391, SEQ ID NO: 392, SEQ ID NO: 
393, SEQ ID NO: 395, SEQ ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, 
SEQ ID NO: 401, SEQ ID NO: 402, SEQ ID NO: 403, SEQ ID NO: 404, SEQ 
ID NO: 405, SEQ ID NO: 406, SEQ ID NO: 407, SEQ ID NO: 408, SEQ ID 
NO: 409, SEQ ID NO: 410, SEQ ID NO: 413, SEQ ID NO: 415, SEQ ID NO: 
418, SEQ ID NO: 421, SEQ ID NO: 423, SEQ ID NO: 424, SEQ ID NO: 426,
SEQ ID NO: 427, SEQ ID NO: 428, SEQ ID NO: 429, SEQ ID NO: 430, a
biological equivalent thereof, or a fragment thereof. 

50. The polypeptide of claim 45, wherein the polypeptide identified by Blastp 
analysis comprises an amino acid sequence chosen from one of SEQ ID NO: 
216, SEQ ID NO: 217, SEQ ID NO: 222, SEQ ID NO: 225, SEQ ID NO: 227, 
SEQ ID NO: 231, SEQ ID NO: 235, SEQ ID NO: 239, SEQ ID NO: 242, SEQ 
ID NO: 245, SEQ ID NO: 246, SEQ ID NO: 247, SEQ ID NO: 248, SEQ ID
NO: 249, SEQ ID NO: 250, SEQ ID NO: 253, SEQ ID NO: 255, SEQ ID NO: 
257, SEQ ID NO: 258, SEQ ID NO: 259, SEQ ID NO: 263, SEQ ID NO: 266, 
SEQ ID NO: 268, SEQ ID NO: 269, SEQ ID NO: 275, SEQ ID NO: 276, SEQ 
ID NO: 280, SEQ ID NO: 282, SEQ ID NO: 283, SEQ ID NO: 284, SEQ ID 
NO: 285, SEQ ID NO: 286, SEQ ID NO: 290, SEQ ID NO: 291, SEQ ID NO: 
292, SEQ ID NO: 293, SEQ ID NO: 294, SEQ ID NO: 295, SEQ ID NO: 302, 
SEQ ID NO: 303, SEQ ID NO: 305, SEQ ID NO: 309, SEQ ID NO: 310, SEQ 
ID NO: 311, SEQ ID NO: 313, SEQ ID NO: 315, SEQ ID NO: 318, SEQ ID 
NO: 320, SEQ ID NO: 322, SEQ ID NO: 323, SEQ ID NO: 324, SEQ ID NO: 
327, SEQ ID NO: 328, SEQ ID NO: 330, SEQ ID NO: 332, SEQ ID NO: 333, 
SEQ ID NO: 337, SEQ ID NO: 338, SEQ ID NO: 339, SEQ ID NO: 342, SEQ 
ID NO: 344, SEQ ID NO: 346, SEQ ID NO: 347, SEQ ID NO: 348, SEQ ID 
NO: 349, SEQ ID NO: 350, SEQ ID NO: 351, SEQ ID NO: 353, SEQ ID NO: 
354, SEQ ID NO: 356, SEQ ID NO: 359, SEQ ID NO: 361, SEQ ID NO: 362, 
SEQ ID NO: 366, SEQ ID NO: 367, SEQ ID NO: 369, SEQ ID NO: 370, SEQ
ID NO: 372, SEQ ID NO: 373, SEQ ID NO: 374, SEQ ID NO: 375, SEQ ID 
NO: 376, SEQ ID NO: 377, SEQ ID NO: 378, SEQ ID NO: 380, SEQ ID NO: 
381, SEQ ID NO: 382, SEQ ID NO: 384, SEQ ID NO: 387, SEQ ID NO: 388, 
SEQ ID NO: 391, SEQ ID NO: 392, SEQ ID NO: 393, SEQ ID NO: 395, SEQ 
ID NO: 396, SEQ ID NO: 397, SEQ ID NO: 399, SEQ ID NO: 400, SEQ ID 
NO: 401, SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 406, SEQ ID NO: 
408, SEQ ID NO: 411, SEQ ID NO: 412, SEQ ID NO: 413, SEQ ID NO: 414, 
SEQ ID NO: 415, SEQ ID NO: 416, SEQ ID NO: 417, SEQ ID NO: 419, SEQ 
ID NO: 420, SEQ ID NO: 421, SEQ ID NO: 422, SEQ ID NO: 423, SEQ ID
NO: 425, SEQ ID NO: 427, SEQ ID NO: 428, SEQ ID NO: 429, a biological
equivalent thereof, or a fragment thereof. 

51. The polypeptide of claim 45, wherein the polypeptide identified by Pfam 
analysis comprises an amino acid sequence chosen from one of SEQ ID NO: 
219, SEQ ID NO: 233, SEQ ID NO: 234, SEQ ID NO: 255, SEQ ID NO: 260, 
SEQ ID NO: 270, SEQ ID NO: 272, SEQ ID NO: 273, SEQ ID NO: 278, SEQ 
ID NO: 279, SEQ ID NO: 281, SEQ ID NO: 287, SEQ ID NO: 289, SEQ ID 
NO: 304, SEQ ID NO: 307, SEQ ID NO: 319, SEQ ID NO: 326, SEQ ID NO: 
331, SEQ ID NO: 334, SEQ ID NO: 343, SEQ ID NO: 352, SEQ ID NO: 357, 
SEQ ID NO: 358, SEQ ID NO: 364, SEQ ID NO: 366, SEQ ID NO: 367, SEQ 
ID NO: 368, SEQ ID NO: 372, SEQ ID NO: 374, SEQ ID NO: 375, SEQ ID 
NO: 377, SEQ ID NO: 378, SEQ ID NO: 379, SEQ ID NO: 380, SEQ ID NO: 
381, SEQ ID NO: 384, SEQ ID NO: 386, SEQ ID NO: 389, SEQ ID NO: 391, 
SEQ ID NO: 395, SEQ ID NO: 397, SEQ ID NO: 398, SEQ ID NO: 399, SEQ 
ID NO: 401, SEQ ID NO: 403, SEQ ID NO: 404, SEQ ID NO: 410, SEQ ID
NO: 413, SEQ ID NO: 414, SEQ ID NO: 420, SEQ ID NO: 427, SEQ ID NO:
428, a biological equivalent thereof, or a fragment thereof. 

52. The polypeptide of claim 45, wherein the polypeptide is a lipoprotein and
comprises an amino acid sequence chosen from one of SEQ ID
NO: 218, SEQ ID NO: 223, SEQ ID NO: 224, SEQ ID NO: 228, SEQ ID NO: 
236, SEQ ID NO: 241, SEQ ID NO: 249, SEQ ID NO: 277, SEQ ID NO: 282, 
SEQ ID NO: 300, SEQ ID NO: 349, SEQ ID NO: 362, SEQ ID NO: 365, SEQ 
ID NO: 383, SEQ ID NO: 385, SEQ ID NO: 388, a biological equivalent 
thereof, or a fragment thereof. 

53. The polypeptide of claim 45, wherein the polypeptide having a LPXTG motif 
and covalently associated with the peptidoglycan layer comprises an amino 
acid sequence chosen from one of SEQ ID NO: 228, SEQ ID NO: 236, SEQ 
ID NO: 249, SEQ, SEQ ID NO: 385, a biological equivalent thereof, or a 
fragment thereof. 
54. The polypeptide of claim 45, wherein the polypeptide having a peptidoglycan
binding motif and associated with the peptidoglycan layer comprises an 
amino acid sequence chosen from one of SEQ ID NO: 240, SEQ ID NO: 264, 
SEQ ID NO: 325, a biological equivalent thereof, or a fragment thereof. 

55. The polypeptide of claim 45, wherein the polypeptide having a signal 
sequence and a C-terminal Tyrosine or Phenylalanine amino acid comprises 
an amino acid sequence chosen from one of SEQ ID NO:226, SEQ ID 
NO:254, SEQ ID NO:289, SEQ ID NO:312, SEQ ID NO:321, SEQ ID NO: 
340, SEQ ID NO:402, a biological equivalent thereof, or a fragment thereof. 

56. The polypeptide of claim 45, wherein the polypeptide having a tripeptide RGD 
sequence that potentially is involved in cell attachment comprises an amino 
acid sequence chosen from one of SEQ ID NO:216, SEQ ID NO:236, SEQ ID 
NO:281, SEQ ID NO:282, a biological equivalent thereof, or a fragment 
thereof. 

57. The polypeptide of claim 45, wherein the polypeptide identified by proteomics 
as surface exposed comprises an amino acid sequence chosen from one of 
SEQ ID NO: 229, SEQ ID NO: 231, SEQ ID NO: 232, SEQ ID NO: 261, SEQ 
ID NO: 279, SEQ ID NO: 281, SEQ ID NO: 282, SEQ ID NO: 284, SEQ ID 
NO: 286, SEQ ID NO: 289, SEQ ID NO: 306, SEQ ID NO: 318, SEQ ID NO: 
331, SEQ ID NO: 343, SEQ ID NO: 346, SEQ ID NO: 351, SEQ ID NO: 366, 
SEQ ID NO: 371, SEQ ID NO: 374, SEQ ID NO: 377, SEQ ID NO: 379, SEQ 
ID NO: 387, SEQ ID NO: 391, SEQ ID NO: 393, SEQ ID NO: 394, SEQ ID NO: 395, SEQ
ID NO: 397, SEQ ID NO: 420, a biological equivalent thereof, or a fragment 
thereof. 

58. The polypeptide of claim 45, wherein the polypeptide identified by proteomics 
as surface exposed comprises an amino acid sequence chosen from one
of SEQ ID NO:592 through SEQ ID NO:752, a biological equivalent thereof, 
or a fragment thereof. 
59. A recombinant expression vector comprising a nucleotide sequence having at
least about 95% identity to a nucleotide sequence chosen from one of SEQ 
ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 
591, a degenerate variant thereof, or a fragment thereof.

60. The vector of claim 59, wherein the polynucleotide is selected from the group 
consisting of DNA, chromosomal DNA, cDNA, RNA and antisense RNA. 

61. The vector of claim 60, wherein the polynucleotide comprises heterologous
nucleotide sequences.

62. The vector of claim 61, wherein the polynucleotide is operatively linked to one
or more gene expression regulatory elements. 

63. The vector of claim 62, wherein the polynucleotide encodes a polypeptide 
comprising an amino acid sequence having at least about 95% identity to an 
amino acid sequence chosen from one of SEQ ID NO: 216 through SEQ ID 
NO: 430 or SEQ ID NO: 592 through SEQ ID NO: 752, a biological equivalent 
thereof, or a fragment thereof. 

64. The vector of claim 59, wherein the vector is a plasmid. 

65. A genetically engineered host cell, transfected, transformed or infected with 
the vector of claim 59. 

66. The host cell of claim 65, wherein the host cell is a bacterial cell. 

67. The host cell of claim 66, wherein the polynucleotide is expressed to produce 
the encoded polypeptide, a biological equivalent thereof, or a fragment 
thereof. 

68. An antibody specific for a Streptococcus pneumoniae polynucleotide chosen 
from one of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 
through SEQ ID NO: 591, a fragment thereof, a degenerate variant thereof, or
a Streptococcus pneumoniae polypeptide chosen from one of SEQ ID NO: 
216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ ID NO: 752, a 
biological equivalent thereof, or a fragment thereof. 

69. The antibody of claim 68, wherein the antibody is selected from the group 
consisting of monoclonal, polyclonal, chimeric, humanized and single chain. 

70. The antibody of claim 69, wherein the antibody is monoclonal. 

71. The antibody of claim 70, wherein the antibody is humanized.

72. An immunogenic composition comprising a polypeptide having an amino acid 
sequence chosen from one or more of SEQ ID NO: 216 through SEQ ID NO: 
430 or SEQ ID NO: 592 through SEQ ID NO: 752, a biological equivalent 
thereof, or a fragment thereof. 

73. The immunogenic composition of claim 72, further comprising a 
pharmaceutically acceptable carrier. 

74. The immunogenic composition of claim 72, further comprising one or more 
adjuvants. 

75. The immunogenic composition of claim 72, wherein the polypeptide is further 
defined as: 

(a) a Streptococcus pneumoniae polypeptide having 0, 1 or 2 
transmembrane domains; 

(b) a Streptococcus pneumoniae polypeptide having 3 or more 
transmembrane domains; 

(c) a Streptococcus pneumoniae polypeptide having an outer membrane 
domain or a periplasmic domain; 

(d) a Streptococcus pneumoniae polypeptide having an inner membrane 
domain; 
(e) a Streptococcus pneumoniae polypeptide identified by Blastp analysis;

(f) a Streptococcus pneumoniae polypeptide identified by Pfam analysis; 

(g) a Streptococcus pneumoniae lipoprotein; 

(h) a Streptococcus pneumoniae polypeptide having a LPXTG motif, 
wherein the polypeptide is covalently attached to the peptidoglycan 
layer; 

(i) a Streptococcus pneumoniae polypeptide having a peptidoglycan 
binding motif, wherein the polypeptide is associated with the 
peptidoglycan layer; 

(j) a Streptococcus pneumoniae polypeptide having a signal sequence 
and a C-terminal Tyrosine or a C-terminal Phenylalanine amino acid; 

(k) a Streptococcus pneumoniae polypeptide having a tripeptide RGD 
amino acid sequence; 

(l) a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed; 

and 

(m) a Streptococcus pneumoniae polypeptide identified by proteomics as 
membrane associated. 

76. The immunogenic composition of claim 75, wherein the polypeptide further 
comprises heterologous amino acids.

77. The immunogenic composition of claim 75, wherein the polypeptide is a 
fusion polypeptide. 

78. The immunogenic composition of claim 75, wherein the polypeptide is 
encoded by a polynucleotide comprising a nucleotide sequence having at 
least about 95% identity to a nucleotide sequence chosen from one of SEQ 
ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 
591, a degenerate variant thereof, or a fragment thereof.

79. The immunogenic composition of claim 78, wherein the polynucleotide further 
comprises heterologous nucleotides. 






80. An immunogenic composition comprising a polynucleotide having a 
nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID NO: 
215 or SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant 
thereof, or a fragment thereof, wherein the polynucleotide is comprised in an expression vector.

81. The immunogenic composition of claim 80, wherein the vector is plasmid
DNA. 

82. The immunogenic composition of claim 81, wherein the polynucleotide 
comprises heterologous nucleotides. 

83. The immunogenic composition of claim 82, wherein the polynucleotide is 
operatively linked to one or more gene expression regulatory elements. 

84. The immunogenic composition of claim 83, wherein the polynucleotide directs 
the expression of a neutralizing epitope of Streptococcus pneumoniae. 

85. The immunogenic composition of claim 84, further comprising one or more 
adjuvants. 

86. A pharmaceutical composition comprising a polypeptide and a 
pharmaceutically acceptable carrier, wherein the polypeptide comprises an
amino acid sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or
SEQ ID NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or 
a fragment thereof. 

87. The pharmaceutical composition of claim 86, wherein the polypeptide is 
further defined as: 

(a) a Streptococcus pneumoniae polypeptide having 0, 1 or 2 
transmembrane domains; 

(b) a Streptococcus pneumoniae polypeptide having 3 or more 
transmembrane domains; 






(c) a Streptococcus pneumoniae polypeptide having an outer membrane 
domain or a periplasmic domain; 

(d) a Streptococcus pneumoniae polypeptide having an inner membrane 
domain; 

(e) a Streptococcus pneumoniae polypeptide identified by Blastp analysis; 

(f) a Streptococcus pneumoniae polypeptide identified by Pfam analysis; 

(g) a Streptococcus pneumoniae lipoprotein; 

(h) a Streptococcus pneumoniae polypeptide having a LPXTG motif, 
wherein the polypeptide is covalently attached to the peptidoglycan 
layer; 

(i) a Streptococcus pneumoniae polypeptide having a peptidoglycan 
binding motif, wherein the polypeptide is associated with the 
peptidoglycan layer; 

(j) a Streptococcus pneumoniae polypeptide having a signal sequence 
and a C-terminal Tyrosine or a C-terminal Phenylalanine amino acid; 

(k) a Streptococcus pneumoniae polypeptide having a tripeptide RGD 
amino acid sequence; 

(l) a Streptococcus pneumoniae polypeptide identified by proteomics as
surface exposed; 

and 

(m) a Streptococcus pneumoniae polypeptide identified by proteomics as 
membrane associated. 

88. The pharmaceutical composition of claim 87, wherein the polypeptide further 
comprises heterologous amino acids. 

89. The pharmaceutical composition of claim 87, wherein the polypeptide is a 
fusion polypeptide. 

90. A DNA chip comprising an array of polynucleotides, wherein at least one of 
the polynucleotides comprises a nucleotide sequence chosen from one of SEQ
ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 
591, a complement thereof, a degenerate variant thereof, or a fragment 
thereof. 




91. A protein chip comprising an array of polypeptides, wherein at least one of the
polypeptides comprises an amino acid sequence chosen from one of SEQ ID 
NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ ID NO: 
752, a biological equivalent thereof, or a fragment thereof. 

92. A method of immunizing against Streptococcus pneumoniae comprising 
administering to a host an immunizing amount of an immunogenic 
composition comprising a polypeptide and a pharmaceutically acceptable 
carrier, wherein the polypeptide comprises an amino acid sequence chosen 
from one or more of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID 
NO: 592 through SEQ ID NO: 752, a biological equivalent thereof, or a 
fragment thereof. 

93. The method of claim 92, wherein the polypeptide is a fusion polypeptide. 

94. The method of claim 92, further comprising an adjuvant. 

95. A method for the detection and/or identification of Streptococcus pneumoniae 
in a biological sample comprising: 

(a) contacting the sample with an oligonucleotide probe of a 
polynucleotide comprising the nucleotide sequence chosen from one 
of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431 
through SEQ ID NO: 591, a degenerate variant thereof, or a fragment 
thereof, under conditions permitting hybridization; and 

(b) detecting the presence of hybridization complexes in the sample, 
wherein hybridization complexes indicate the presence of 
Streptococcus pneumoniae in the sample. 

96. A method for the detection and/or identification of Streptococcus pneumoniae 
in a biological sample comprising: 

(a) contacting the sample with an oligonucleotide primer of a 
polynucleotide comprising the nucleotide sequence chosen from one 
of SEQ ID NO: 1 through SEQ ID NO: 215 or SEQ ID NO: 431
through SEQ ID NO: 591, a degenerate variant thereof, or a fragment 
thereof, in the presence of nucleotides and a polymerase enzyme 
under conditions permitting primer extension; and 
(b) detecting the presence of primer extension products in the sample, 
wherein extension products indicate the presence of Streptococcus 
pneumoniae in the sample. 

97. A method for the detection and/or identification of Streptococcus pneumoniae 
in a biological sample comprising: 

(a) contacting the sample with an antibody specific for a polypeptide 
comprising an amino acid sequence chosen from one of SEQ ID 
NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through SEQ 
ID NO: 752, a biological equivalent thereof, or a fragment thereof, 
under conditions permitting immune complex formation; and 

(b) detecting the presence of immune complexes in the sample, wherein 
immune complexes indicate the presence of Streptococcus 
pneumoniae in the sample. 

98. A method for the detection and/or identification of antibodies to Streptococcus 
pneumoniae in a biological sample comprising: 

(a) contacting the sample with a polypeptide comprising an amino acid 
sequence chosen from one of SEQ ID NO: 216 through SEQ ID NO: 
430 or SEQ ID NO: 592 through SEQ ID NO: 752, a biological 
equivalent thereof, or a fragment thereof, under conditions permitting 
immune complex formation; and 

(b) detecting the presence of immune complexes in the sample, wherein 
immune complexes indicate the presence of Streptococcus 
pneumoniae in the sample. 

99. A kit comprising a container containing an isolated polynucleotide comprising 
a nucleotide sequence chosen from one of SEQ ID NO: 1 through SEQ ID
NO: 215 or SEQ ID NO: 431 through SEQ ID NO: 591, a degenerate variant
thereof, or a fragment thereof. 

100. The kit of claim 99, wherein the polynucleotide is a primer or a probe.

101. The kit of claim 100, wherein the polynucleotide is a primer and the kit further
comprises a container containing a polymerase. 

102. The kit of claim 99, wherein the kit further comprises a container containing 
dNTP. 

103. A kit comprising a container containing an antibody that immunospecifically 
binds to a polypeptide comprising the amino acid sequence chosen from one 
of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 592 through 
SEQ ID NO: 752, a biological equivalent thereof, or a fragment thereof. 

104. A kit comprising a container containing an antibody that immunospecifically 
binds to a fusion polypeptide comprising at least the amino acid sequence 
chosen from one of SEQ ID NO: 216 through SEQ ID NO: 430 or SEQ ID NO: 
592 through SEQ ID NO: 752, a biological equivalent thereof, or a fragment 
thereof. 

105. A method for producing a polypeptide which comprises culturing the 
genetically engineered host cell of claim 66 under conditions suitable to 
produce the polypeptide and recovering the polypeptide from the culture. 
SEQUENCE LISTING

<110> Wyeth 

<120> NOVEL STREPTOCOCCUS PNEUMONIAE OPEN READING FRAMES ENCODING 
POLYPEPTIDE ANTIGENS AND USES THEREOF 

<130> AM100649-PCT 

<160> 752 

<170> PatentIn version 3.1

<210> 1 

<211> 684 

<212> DNA 

<213> Streptococcus pneumoniae 



<400> 1 
gctcgggcta aatcagtcca ctggactgat ttactacacc agtatagctt caagctctgt 60

cagaaacgat tctatcagcc cacgtttcga atgcacttaa cccatcggga agtacgagat 120

aagctgcttt cttactctga gggattacag gttcactacg aactctatca actcctgctc 180

tttcattttc aagagaagaa tgccgaccat ttctttggat tgattgagca agaactgcca 240

acggttcatc cgctttttca aacggtcttt tggacttttt taagggatag agataagatt 300

atcaacgcac ttaagctgcc ttattccaac gctaaacttg aagcgaccaa taatttgatt 360

aagattatca agcgcaaagc ctttggtttc cggaacttta acaattttaa aaaacggatt 420

ttgatgactt tgaacatcaa aaaagagagt acgaatttcg tactctccag attgcagctt 480

ttcgcctacc cactacactt gacaaagagc cactctttat tccatggtat caaaggcaag 540

acttggtttg gcattgaggt cccagcctgc gaagttttct ttgttccact cgctgacgct 600

ggcataggca atcatacctg cattgtctcc gcagagtcgc agagggggga tgataacctt 660

gacatctgtg atttcggctg ctag 684



<210> 2 
<211> 675 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 2 

gagggggcgc aggcagccat gccaacggct cttggctatg tcagtatcgg cctggcctgt 60

ggaattatcg gtgcgcccta tgtgacacct gttgagatgg gcttgatgag tctctttgtt 120

ttttgttgag

tggatgcacg gaaacaatct taacagctat gfcggcttggt ttgtggggac agtagtcgga 420

acggctctgg gtggcctgct accaaatcca gaaatctttg gcttggattt tgccctggtt 480

gggatgttta ttggtatttt tgcttcgcaa tttcagatta tgcaaagacg gattcctgtc 540

cgcaatctgc tcattatcct agcagttgtt gcggtgtcct tctttttgct cttgacagtg 600

atgtctcagt cactagctgt tctgtttgcg acgctacttg gttgtagcat gggggtggtt 660

ttagatggtc agtaa 675




<210> 3 

<211> 864 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 3 



gattataagg tattctattt tggaggaaat gacatgaaaa aaatcgttaa atactcatct 60

cttgcagccc ttgctcttgt tgctgcaggt gtgcttgcgg cttgctcagg gggtgctaag 120

aaagaaggag aagcagctag caagaaagaa atcatcgttg caaccaatgg atcaccaaag 180

ccatttatct atgaagaaaa tggcgaattg actggttacg agattgaagt cgttcgcgct 240

atctttaaag attctgacaa atatgatgtc aagtttgaaa agacagaatg gtcaggtgtc 300

tttgctggtc ttgacgctga tcgttacaat atggctgtca acaatcttag ctacactaaa 360

gaacgtgcgg agaaatacct ctatgccgca ccaattgccc aaaatcctaa tgtccttgtc 420

gtgaagaaag atgactctag tatcaagtct ctcgatgata tcggtggaaa atcgacggaa 480

gtcgttcaag ccactacatc agctaagcag ttagaagcat acaatgctga acacacggac 540

aacccaacta tccttaacta tactaaggca gacttgcaac aaatcatggt acgtttgagc 600

gatggacaat ttgactataa gatttttgat aaaatcggtg ttgaaacagt gatcaagaac 660

caaggtttgg acaacttgaa agttatcgaa cttccaagcg accaacaacc gtacgtttac 720

ccacttcttg ctcagggtca agatgagttg aaatcgtttg tagacaaacg catcaaagaa 780

ctttataaag atggaactct tgaaaaattg tctaaacaat tcttcggaga cacttatcta 840

ccggcagaag ctgatattaa ataa 864

<210> 4 
<211> 1389 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 4 

aaaggtagag agaatatggt ttttcctagc gaacaagaac agattgaaaa atttgaaaag 60

gatcatgtag cccagcatta ttttgaggtt ttgcgtacct tgatttctaa gaaatcagtc 120

tttgcccagc aggttggact caaggaagtc gcaaattatc tgggtgagat tttcaagcgt 180

gttggagctg aagtggagat tgatgagagc tatacagcgc cctttgtcat ggcacatttc 240

aagagttcgc gtccagatgc caagaccttg attttctata accactatga cactgtgcca 300

gcggatgggg atcaggtctg gacagaggat ccttttacgc tttcggtccg caatggcttc 360

atgtatgggc gtggggttga tgacgacaag ggtcatatca cagctcgctt gagtgctttg 420

agaaaatata tgcagcacca tgatgattta cctgtcaata tcagctttat catggaggga 480

gcggaggaat cggcttcaac agacctagat aagtatttgg aaaagcatgc agacaaactc 540

cgtggggcgg atttgttggt ctgggaacaa gggaccaaaa atgccttgga acagctggaa 600

atttctggtg gcaataaggg gattgtgacc tttgatgcca aggtaaaaag cgctgatgtg 660

gatatccact cgagttatgg tggtgttgtg gaatcagctc cttggtatct cctccaagcc 720

ttacagtctc ttcgtgctgc ggatggccgt atcttggttg aaggcttgta cgaagaagta 780

caagagccca atgaacgaga aatggccttg ctagaaactt atggtcaacg aaacccagag 840

gaagttagtc ggatttatgg attggagttg cctctcttac aggaggagcg gatggccttt 900

ctaaaacgtt tctttttcga tccagcgctt aatatcgaag gaatccagtc tggttatcaa 960

ggtcagggtg ttaagactat tttacctgca gaagccagtg ccaagctaga ggttcgtctg 1020

gttccgggcc tagaaccgca tgatgttctg gaaaaaattc ggaaacagct agacaaaaat 1080

ggctttgata aggtagaatt atactatacc ttgggagaga tgagctatcg aagcgatatg 1140

agcgcaccag ccattctcaa tgtgatcgag ttggccaaga aattctatcc acagggcgtt 1200

tcagtcttgc cgacgacagc ggggacagga cctatgcata cggtctttga tgccctagag 1260

gtaccaatgg ttgcattcgg tctaggaaat gccaatagcc gagaccacgg tggagatgaa 1320

aatgtgcgaa tcgctgatta ttacacccat atcgaattag tagaggagct gattagaagc 1380

tatgagtag 1389

<210> 5 

<211> 624 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 5 



gggaatatca tgggtagatt tttagacttt gtctttaatc gtttcttttt agggatgatt 60

gcgacagcct tcttttggct attaacttta gcaggaggga ttatccttgg tctagcgccg 120

gctagtgcca ccttgatgag cttatatgca gaacatggtt atagctttcg ggaatacagt 180

ttgaaggagg cttggtctct ttacaagcaa aattttgtct caagcaacct gattttctat 240

agctttttag gtgtgggtct agttttgacc tatggtttgt atctcttggt gcaattgcct 300

catcagacca ttgttcattt gattgcgacc cttttgaatg tcctagtagt tgccctgatc 360

tttttggctt atacagtatc tttaaaatta caagtttatt ttgccttgtc ctatcgaaat 420

agtctcaaat tatccttgat tggcatcttt atgagtctag cagctgtggc taaggttctc 480

cttgggactg tgctacttgt agcaattggt tattatatgc ctgccctgct attttttgta 540

ggaattggga tgtggcattt ctttatcagt gatatgttgg aacctgtcta tgaaatcatc 600

catgaaaaat tggcgacaaa atag 624



<210> 6 
<211> 630 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 6 

actcttgcca accaatttta tccaaatttc ccaatcagaa atcatcaata tcgattccat 60 

ctctcacctc aagctcacgc caaacggtct ggtagaaatt ttcttgaaaa acgaaagctt 120 

cacctactct tcacgccgtt atctaaaaac catcaaggag aaattagaac tatgaaaaaa 180 

caagtatttc acgatgcagc taccggtgtt cttatcggcc tcatcctctc tatcctcttt 240 

tcactcattt atgcaccaaa tacctacgca ccactaaatc cctactctct cataggccaa 300 

gtgatggatc agcatcaggt tcacggtgcc ctggtcttgc tctactgcac acttatctgg 360

gcaaccatcg gtatgctctt caactttggc aaccgcttat ttagccgtga ctggagcatg 420

cttcgtgcca ctctgactca tttcttcctt atgctggctg gctttgtccc actagcaact 480

cttgctggtt ggttcccttt ccactggatt ttctacctcc agctcattat cgagtttgcg 540

attgtctatc tcatcatctg ggctattctc tataaaagag aggctaaaaa agtagatcac 600

atcaatcaac tcttggagca tagaaaatag 630

<210> 7 

<211> 609 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 7 



gagagaagga atactatgta cgcatattta aaaggaatca ttaccaaaat tactgccaaa 60

tacattgttc ttgaaaccaa tggtattggt tatatcctgc atgtggccaa tccttatgcc 120

tattcaggtc aggttaatca ggaggctcag atttatgtgc atcaggttgt gcgtgaggac 180

gcccatttgc tttatggatt tcgctcagag gatgagaaaa agctctttct tagtctgatt 240

tcggtctctg ggattggtcc tgtatcagct cttgctatta tcgctgctga tgacaatgct 300

ggcttggttc aagccattga aaccaagaac atcacctact tgaccaagtt ccctaaaatt 360

ggcaagaaaa cagcccagca gatggtgctg gacttggaag gcaaggtagt agttgcagga 420

gatgaccttc ctgccaaggt cgcagtgcaa gcaagtgctg aaaaccaaga attggaagaa 480

gctatggaag ccatgttggc tctgggctac aaggcaacag agctcaagaa aatcaagaaa 540

ttctttgaag gaacgacaga tacagctgag aactatatca agtcggccct taaaatgttg 600

gtcaaatag 609



<210> 8 

<211> 675 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 8 



tgtagaaaat gcagaagcac gtttgcgtgc agctctataa acatcaaggc tgggagcact 60

tcccagtctt attctatttt aatttcaaaa agaaagaaga aagaaatgaa aaaaatagtt 120

cttgttagtc tagctttcct ttttgtcctg gttggttgcg gacagaaaaa agaaactgga 180

ccagctacaa aaacagaaaa agatacgctt cagtcggcat tgccagttat tgaaaatgct 240

gagaagaata cagttgtaac taagactttg gtcttgccca agtcagatga tggtagccag 300

cagacacaaa caattactta caaagacaag acttttttga gtctagctat ccaacaaaaa 360

cgtccagtct ctgatgagtt gaagacttat attgaccaac atggagtgga ggaaactcaa 420

aaagctcttc ttgaagcgga ggagaaggat aagtctatca ttgaagctcg taaattggca 480

ggtttcaaac ttgaaacaaa actattgagc gcaacggaac ttcaaacaac gactagtttt 540

gattttcaag ttcfcggatgt caagaaggct tcccagttgg aacatctgaa gaatattggt 600

ttggaaaatc ttttgaaaaa tgaaccaagc aaatatattt cagatagatt ggcaaatggc 660

gcgacagaac aatag 675



<210> 9 

<211> 555 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 9 



gcagataaat tgactccatt ttttgaactt gttatactag gggaattgct ggttagagaa 60

aatttctcta aattggtagc agaaaggaaa ttcatcatga aattaaaaag attcacactt 120

tctcttgctt ctctagcaag ttttagtctc ttagtagctt gttcacaaag agctcaacag 180

gttcaacagc ctgttgctca gcagcaggtc caacaacctg ctcaacagaa taccaatact 240

gcaaatgcag gaggtaacca aaatcaagcg gctccagtac aaaaccaacc tgttgctcaa 300

ccgaccgata ttgatgggac ttatactggt caggatgacg gagaccgtat cactttagtg 360

gtaactggaa cgactggtac atggactgag ctcgaatctg acggggatca gaaagtcaaa 420

caggttacat tggattcagc aaatcaacgc atgattattg gcgatgatgt caaaatttac 480

actgtaaacg gtaatcaaat cgtcgtagat gatatggata gagacccatc ggaccaaatc 540

gttttaacta aataa 555



<210> 10 

<211> 1557 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 10 



cattcaaact atcaaggagg ggatatgaaa tataggaaat ttcaattatt gatgtccaag 60

tatggcttta gtctttcgat tatgctactt gaactttgtc ttgtttttgg tctctttctt 120

tatttaggac gcatggctcc cattttatgg attactgtcc tcattctact gagtatcatc 180

acaatcattt cgatagtcaa ccgtaatacg actcctgaga ataaggtaac ctggttgtta 240

gtagcctttg tgccagtatt tggtcccttg ctctatctga tgtttggtga aaggcgattg 300

tccaaaaaag aaatcaaaca actgaagaag ctaggctcta tgcatttcca agaagcaaat 360

agccagctac taaaagagaa attaaaagaa agtgacaagg cagcttatgg agtcatcaag 420

tccttattga gtatggatac caatgctgac atctatgatc aaactgcctc tacatttttt 480

cctaacggag aagctatgtg gaaaaagatg gtagaagatc ttaaaaaggc tgagaaattt 540

attttcttgg aatattacat tatagaagaa ggtttgatgt ggaatcgcat actagatata 600

ctagagcaaa aggtagctca gggtgtagag gttaagatgc tctatgatga tatcggctgt 660

atggctactt taacaggaga ttatgcacat cgacttcgtc agctgggcat cgaggcccat 720

aaattcaata aagttattcc tcgtttgaca gtggcttata ataacagaga tcatagaaaa 780

atattgattg ttgatggtca gatagcctat actggtgggg tcaatctggc agatgagtac 840

attaaccacg tcgagagatt tggttattgg aaggatagtg gaattcgctt agacggacta 900

gcagtaaaag ctctgacacg cttatttttg accacttggt acattaatcg aggagaaatt 960

agtgattttg atcaatatca tttagaaaat cattctatcc cgagtgacgg tttaaccatt 1020

ccatacggaa gtggacccaa gccaattttt cgagcgcagg tagggaaaaa agtttatcag 1080

agtttaatca atcaagcaac agaatcggtc tatattacga caccttattt gattatagat 1140

tatgatttaa cagagacaat caaaaatgca gctatgagag gggtcgatgt tcgaattatc 1200

accccttaca taccagataa gaagttcatt cagttagtca cgagaggagc ttatcccgac 1260

cttctttctg ctggtgttcg gatttatgag tatagtccag gttttattca tagtaagcag 1320

atgttggtag acgaagattt tgcggtggtg gggacaatca atctcgacta ccggagcttg 1380

gtacaccatt atgaaaatgc agtcttactc tataaaactc cttctataag ggaaatcgcc 1440

cgagattttc gaaatatatt tgcagattct caggaagtct atcctcattc tatcaaaacg 1500

agctggtatc aaaagcttgt aaaagaaatc gcccagctat tcgcccctat cttataa 1557



<210> 11 

<211> 282 

<212> DNA 

<213> Streptococcus pneumoniae 



<400> 11 



gaagacatca ttgatatctt gattaccttc gatgtcatga accaaacctt tggcacggta 60

gtgagcaatg attggttctc cttgagcaat attaacatcc aaacgacgtt ttactgtctc 120

aggcttatca tcttcacgtt ggtagtaatc ttcttcttta tagtcaactg gtgggttaaa 180

gaccttgtgg aaagtttctc cagttacgcg gtggatgata cgcccactca aacgttccaa 240

aaggctgtca gggttcactt caatattgat aacaccttct ag 282

<210> 12 

<211> 1473 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 12 



atgataaatg atataattct attattgttc gtaaaaatta aaaggaga 1 1 gatgatggac 60

gaacggtaca cgctggttta

acaactttct ttgcaatgag ctatattctc tttgtaaacc cacaaatact ttcacaaaca 180

ggaatgcctg ctcagggcgt cttcctagcg acgattattg gtgcagtagc gggtaccttg 240

atgatggctt tttatgctaa cttaccttat gcccaagcgc caggtatggg actcaatgcc 300

ttctttacct ttacagttgt attcgggctt ggttattctt ggcaagaagc cctagctatg 360

gtcttcatct gtgggattat ttcattgatt attaccttga caaatgttcg taaaatgatc 420

attgaatcga ttcccaatgc tcttcgctca gctatttcag ctggtatcgg tgtcttcctt 480

gcctatgtag ggafctaagaa tgctggactfc fctgaaattca cgattgatcc aggcaactat 540

actgttgtag gagaaggggc fcgacaaagct caagcaacga ttgcagcaaa ctcttcagca 600

gttccaggat tggtcagctt taataafccca gcfccrfctttag tggctcttgc aggacttgcc 660

attactatct tctttgtcat caaagggatt aaagggggaa ttattctctc tatcttgaca 720

acaactgttc ttgctattgc agttggtttg gttgatttgt ctagtatcga ttttgctaat 780

aaccatgttg gtgcagcttt tgaagatttg aagacaatct ttggtgcagc tcttggttca 840

gaaggattgg gagctttggt ttcagataca gctcgcttgc ctgaaactct gatggccatt 900

cttgccttct cattgacaga tatttttgac acaattggta ccttgatcgg tacaggtgaa 960

aaagttggta tcgtagcgac aaatggtgaa aatcaccaat cagccaaatt ggataaggct 1020

ctttactctg atttgattgg aacgacagtc ggtgccattg caggtacttc aaacgtaacg 1080

acttatgttg agtctgctgc tggtatcggt gcaggtggac gtactggttt gacagccttg 1140

gttgtagcta tctgttttgc gatttcaagc ttctttagcc cacttctagc gatcgtacca 1200

acagcggcta cagctccaat cttgattatc gttgggatta tgatgcttgg tagcttgaaa 1260

aatatccatt gggatgatat gtctgaagca gttcctgcct tcttcacatc tatctttatg 1320

ggattcagct actctatcac tcaagggatt gcagttggtt tcttgactta cactttgact 1380

aagcttgtta aaggtcaagt taaagatgtt catgtcatga tttggatttt ggatgccttg 1440

tttatcctta actacatcag catggcctta taa 1473

<210> 13 

<211> 3240 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 13 



ttcctattga ttttacaata tgtttattgg agtgtataca tgcaaacaaa aacaaagaag 60
ctcattgtga gtttgtcttc acttgtttta tcaggatttt tattaaacca ttatatgaca 120
attggagcgg aagaaacgac tacgaatacc attcagcaaa gccagaagga agttcagtat 180
cagcaaaggg atacaaaaaa tttagttgaa aatggtgatt ttggtcagac ggaggacgga 240
agcagtccgt ggacaggaag caaagctcag gggtggtcag cttgggtaga ccagaagaat 300
agtgcagatg cctcaactcg agtcattgag gctaaggatg gggctatcac tatctcaagc 360
catgagaaat taagggcagc gcttcaccgt atggttccta ttgaagctaa gaaaaagtat 420
aaactgcgtt tcaagattaa aacagataat aaaatcggga ttgccaaagt tcgtatcatt 480
gaggaaagtg gtaaggacaa gcgattgtgg aattctgcaa cgacgtcagg aacaaaggac 540
tggcagacca ttgaagcaga ctatagcccg actttagatg ttgataaaat caagctggag 600
ttattctatg aaacaggaac tgggactgtt tcctttaagg atattgagct ggtagaggta 660
gcagaccagc tttctgagga ttctcaaaca gataaacagc ttgaggaaaa gattgattta 720
ccaattggaa aaaaacatgt tttttctctt gcggactata cttataaggt agaaaatcct 780
gacgttgctt cagtcaaaaa tggaatttta gaacctctta aggaagggac aaccaatgtc 840
attgtcagta aagatggcaa ggaagtgaaa aagattcctt tgaagattct ggcctctgtt 900
aaggatgcat acacagaccg tttggatgac tggaatggca tcatcgctgg gaatcaatac 960
tatgattcta aaaatgaaca gatggccaaa ttaaaccagg aattggaagg aaaggtagct 1020
gatagcctat ccagtatttc aagtcaggcg gaccgcacct atttgtggga aaaattttca 1080
aattataaaa cgtctgcaaa tctgactgcc acttatcgga aattggagga gatggccaag 1140
caagtgacca atccttcttc tcgttattat caagatgaaa ctgtcgttcg aacagtcagg 1200
gattccatgg aatggatgca taaacatgtc tacaatagtg aaaagagcat tgttgggaac 1260
tggtgggatt atgaaatcgg tacacctcgt gccatcaaca ataccttgtc tctgatgaaa 1320
gaatacttct ctgatgagga aattaaaaaa tatacagatg tgattgaaaa atttgtacca 1380
gatcccgaac atttccgaaa gacgactgat aacccattca aggctctagg tggaaactta 1440
gttgatatgg gaagggtaaa agtaatagct ggtttactgc gtaaggatga tcaagaaatt 1500
tcttctacca ttcgctcgat tgagcaagtg ttcaagttgg tagaccaagg tgaaggtttt 1560
tatcaagatg gatcctatat cgaccacacc aatgttgcct atacgggtgc ttatgggaat 1620
gttttgattg atggcctgtc tcaactgttg ccagtcattc aaaagaccaa gaatccaatc 1680
gataaagata aaatgcaaac catgtaccac tggattgata aatcgtttgc tcctttgctg 1740
gtgaatggag agttgatgga tatgagtcgt ggacgctcga tcagtcgtgc aaatagcgag 1800
gggcacgtgg ccgcagtaga agtactaaga gggattcacc gaatagcgga tatgtctgaa 1860
ggagaaacca aacaatgttt gcagagtctt gtgaagacca ttgttcaatc ggatagttat 1920
tatgatgtct ttaagaattt gaagacttat aaggatatca gtttgatgca atccttgtta 1980
agtgatgcag gagtcgcaag tgttccaaga ccaagttacc tatctgcctt taacaagatg 2040
gataaaacag ccatgtacaa tgcagagaaa gggtttggat ttggcttgtc actcttttcc 2100
agtcgtacct tgaattacga acacatgaac aaggaaaata aacgtggttg gtatacgagt 2160
gatgggatgt tctatcttta caatggcgat ttgagtcact atagcgatgg ctactggcca 2220
acagttaatc catataagat gcctggtaca acagagacgg atgctaagag agcggatagc 2280
gatacaggta aagttttacc gtctgctttc gttggaacga gcaaactaga tgatgccaat 2340
gcgacagcaa ccatggattt caccaactgg aatcaaacat tgactgctca taagagctgg 2400
tttatgctaa aggataagat cgccttttta ggaagcaata tccaaaacac ttcaacagat 2460
actgctgcaa ctacaattga ccagagaaaa ctggaatcag gtaatccata taaagtctat 2520
gtcaatgata aagaagcctc ccttacagaa caagaaaagg attatcctga aacccaaagt 2580
gtctttttag aatcgttcga ttcgaaaaag aatattggtt actttttctt taagaagagt 2640
tcaatcagta tgagtaaggc tttgcaaaag ggagcctgga aggatatcaa tgaaggacag 2700
tcagacaagg aagttgaaaa tgaatttctt acgattagtc aggctcataa gcaaaataga 2760
gattcttatg gctatatgct cattcctaac gtggatcgtg ccaccttcaa tcaaatgata 2820
aaagagttag aaagtagcct catcgaaaat aacgaaaccc ttcagtctgt ttatgatgct 2880
aaacaaggag tttggggcat tgtgaaatat gatgattctg tctctactat ttccaaccaa 2940
ttccaagttt tgaaacgtgg agtctatacc attcgaaaag aaggggatga atataagatt 3000
gcctactata atcctgaaac ccaggaatca gctccagatc aggaagtctt taaaaagcta 3060
gagcaagcag ctcagccaca agtacagaat tcaaaagaaa aggaaaaatc tgaagaggaa 3120
aagaaccatt cggatcaaaa gaatctccct cagacaggag aaggtcagtc aatcttggca 3180
agtctagggt tcttgctact tggggcattt tatctattcc gtagaggaaa gaacaactaa 3240



<210> 14 

<211> 831 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 14 



tggagctgtt caagtcaaca ttatggacta tatttaaaag aggagatcgt tatgtcgatt 60
aatgtatttc aagcgatttt aattggatta tggacagctt tctgttttag tggaatgctg 120
ttaggaattt acaccaatag atgtattgtt ctgtcatttg gtgtcggaat tattctaggt 180
gatctgccta ctgctcttgc aatgggagct attggtgaat tggcttatat gggattcggt 240
gttggtgctg gaggtactgt tccaccaaac ccaatcggac ctggtatctt tggtaccttg 300
atggctatca ctagtgctgg taaagtcagt ccagaagcgg ctcttgccct ctctactccg 360
attgctgtgg cgattcaatt cttacaaact ttcgcctaca ctgtacgtgc tggtgcgcct 420
gaaacagcta tgaagcactt gaaaaaccat aatttgaaga aatttaagtt cactctaaat 480
gcaacaattt ggttgtttgc ctttattgga tttaccttgg gttgcttggg tgccctttca 540
atggatacct tgttgaaact cgtagactac attccaccgg tattacttac aggtttgaca 600
gttgctggta aaatgctccc agctatcggt tttgcgatga tcttgtcagt gatggctaag 660
aaagagttga ttccctttgt cttgttggga tatgtttgtg cagcttatct aaacatccca 720
acaattggta ttgcaattgt aggtactatc tttgctttga ttgaatttta taacaagcca 780
aaaacagcgg atcatgtggt agaggaggaa gcacacgatg actggatcta a 831



<210> 15 
<211> 399 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 15 

tacatattgt cgactcactt cgtattgcaa gagctaaaaa agaccaggat taggaggtgc 60 
cttatgaaat cactagctag actactgatc attcatgttt ttatcagtat tttccttttc 120 
ttcgccctta cttcaggagc tatttctcat acagttttac tactcctact cctctttctt 180
cctgcgctca ataaaggact tgagaaaata caatcaaaac ggatacctgt cctcaacgca 240
gccctcttct ttctcctcat atcctttcca caacttttaa ccaaccctgt ccaatggaaa 300
ttttcaatat tcctagtcgt aaccatcatt tcaagtttgg cctacttcta taacttttat 360
caagtagtta aagaagtaga tcaaaaacag ttgatttag 399



<210> 16 

<211> 2256 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 16 



gatatgaagt 


ggacaaaaag 


agtaatccgt 


tatgcgacca


aaaatcggaa 


atcgccggct 




gaaaacagac 


gcagagttgg 


aaaaagtctg 


agut cautat 


n 1- rr t- r* +- 1- 1- rr 1- 














gctttggaac 


agatttagcg 




aaggaagcta agaaggttca tcaaaccacc cgtacagttc ctgccaaacg tgggactatt 240
tatgaccgaa atggagtccc gattgctgag gatgcaacct cctataatgt ctatgcggtc 300
attgatgaga actataagtc agcaacgggt aagattcttt acgtagaaaa aacacaattt 360
aacaaggttg cagaggtctt tcataagtat ctggacatgg aagaatccta tgtaagagag 420
caactctcgc aacctaatct caagcaagtt tcctttggag caaagggaaa tgggattacc 480
tatgccaata tgatgtctat caaaaaagaa ttggaagctg cagaggtcaa ggggattgat 540
tttacaacca gtcccaatcg tagttaccca aacggacaat ttgcttctag ttttatcggt 600
ctagctcagc tccatgaaaa tgaagatgga agcaagagct tgctgggaac ctctggaatg 660
gagagttcct tgaacagtat tcttgcaggg acagacggca ttattaccta tgaaaaggat 720
cgtctgggta atattgtacc cggaacagaa caagtttccc aacgaacgat ggacggtaag 780
gatgtttata caaccatttc cagccccctc cagtccttta tggaaaccca gatggatgct 840
tttcaagaga aggtaaaagg aaagtacatg acagcgactt tggtcagtgc taaaacaggg 900
gaaattctgg caacaacgca acgaccgacc tttgatgcag atacaaaaga aggcattaca 960
gaggactttg tttggcgtga tatcctttac caaagtaact atgagccagg ttccactatg 1020
aaagtgatga tgttggctgc tgctattgat aataatacct ttccaggagg agaagtcttt 1080
aatagtagtg agttaaaaat tgcagatgcc acgattcgag attgggacgt taatgaagga 1140
ttgactggtg gcagaacgat gactttttct caaggttttg cacactcaag taacgttggg 1200
atgaccctcc ttgagcaaaa gatgggagat gctacctggc ttgattatct taatcgtttt 1260
aaatttggag ttccgacccg tttcggtttg acggatgagt atgctggtca gcttcctgcg 1320
gataatattg tcaacathgc gcaaagctca tttggacaag ggatttcagt gacccagacg 1380
caaatgattc gtgcctttac agctattgct aatgacggtg tcatgctgga gcctaaattt 1440
attagtgcca tttatgatcc aaatgatcaa actgctcgga aatctcaaaa agaaattgtg 1500
ggaaatcctg tttctaaaga tgcagctagt ctaactcgga ctaacatggt tttggtaggg 1560
acggatccgg tttatggaac catgtataac cacagcacag gcaagccaac tgtaactgtt 1620
cctgggcaaa atgtagccct caagtctggt acggctcaga ttgctgacga gaaaaatggt 1680
ggttatctag tcgggttaac cgactatatt ttctcggctg tatcgatgag tccggctgaa 1740
aatcctgatt ttatcttgta tgtgacggtc caacaacctg aacattattc aggtattcag 1800
ttgggagaat ttgccaatcc tatcttggag cgggcttcag ctatgaaaga ctctctcaat 1860
cttcaaacaa cagctaaggc tttagagcaa gtaagtcaac aaagtcctta tcctatgcct 1920
agtgtcaagg atatttcacc tggngattta gcagaagaat tgcgtcgcaa tcttgtacaa 1980
cccatcgttg tgggaacagg aacgaagatt aaaaacagtt ctgctgaaga agggaagaat 2040
cttgccccga accagcaagt ccttatctta tctgataaag cagaggaggt tccagatatg 2100
tatggttgga caaaggagac tgctgagacc cttgctaagt ggctcaatat agaacttgaa 2160
tttcaaggtt cgggctctac tgtgcagaag caagatgttc gtgctaacac agctatcaag 2220
gacattaaaa aaattacatt aactttagga gactaa 2256



<210> 17 
<211> 660 
<212> DNA

<213> Streptococcus pneumoniae 
<400> 17 

tttaatttgt caaatggaaa tagaatgaaa aatggaaata gaatttatag ttggaggttg 60 
tttatgtacg gtataataaa acgattaggt gatatattat tatctttaat agggataata 120 
atattgtgtc cggtttttat gataattgca attgcgatta aacttgattc agaaggtccg 180 
gttatattta agcaaaaacg ctttggtatt cataaagaat acttctatat tttgaaattt 240 
aggtctatga aaatagatgc acctaaaaat gtggcgcctc gaaacttata taatccagag 300
caatggatta caaaagtagg ggctttcttg cgaaaaacat ctttggatga actaccacaa 360
ttgtttaata ttcttgttgg taatatgagt attgtaggtc ctagaccagc gggtataaat 420
gaactagatt tgattgcaga gagagataag tatggagcaa atgatatctt gccagggtta 480
actggatggg cacaaattaa cgggcgtgat actttgtctg ttgagatgaa gacggagtta 540
gatggctact atgttaaaca tctgtctttg ataatggata ttagatgtat agttaagaca 600
ataccttacg tactgaaacg aaaaggtatt gtagagggta gtggtaagaa agaaagttaa 660



<210> 18 

<211> 1251 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 18 



gaaagaaagt 


taaattggac 


aatgaaaata 


ctatttgttt 


gccaacatta 


taagccagaa 










ttagttcgaa 


aagggcatga 


agtctctgtt 




ttggctggga ttcctaatta ccctgaaggg aagatatatg cagattatcg tcataataaa 180
aaaagacgtg agattataga aggtgttacg atatatcgtt cttatacaat ccctagaaaa 240
aaaagtgttg tatttcgatt gttgaattat tttagctttg caattagttc tactttagga 300
gttttattgg ggaggtataa aacgaaagat ggatcgaatt ttgactgtgt attcgttaac 360
caattgtctc cagttatgat ggcatgggct ggtatggctt ataaaaaaaa atataagaaa 420
ccgatgtttc tatattgtat ggatgtttgg ccagatagtt taaccgtagg tggagtgaaa 480
caagatggct tgattttcaa gctgtttaaa tttatctcaa aaaaagttta ccgagctagt 540
gattatatat ttgtcactag tccatcattt aaaaattatt ttgtgaagca atttgacata 600
tccgaacaaa agattacata tttgccacaa tatgcagaag atctttttat ccctgatgaa 660
tctatagtta ataaagaaag tgttgaccta acttttgctg gtaatattgg caaagcacaa 720
aatttggaaa ctattttgaa agctgccagt ttgatagaga agaataccaa tttacccaag 780
aaaattcatt ttcattttgt tggagatggt acggaattgt taagcatgaa agcattagct 840
catgaattgg agttaaagaa tatttccttc tatggaagac gttctttgga ggaaatgcca 900
tccttctata aaaaatcaga tgctatgtta gtttctttaa taggagactc gatagtttct 960
cgtactatac ctgggaaggt acaatcttat atggcggcag gcaaaccaat tataggtgca 1020
atttcaggag atgctaaaat aattgtagaa gaagcaaatt gtggatatgt tagtcccgaa 1080
cgagatgtaa aacaattggc aaaaaatatt tgtaaattta gtatgttatc tattaagaga 1140
caaagagagt taggaaagaa agctcgttgt tactatgaaa atcacttttc aaaagagcag 1200 
tttatgctcg aactggagac atgtttagag agggaaagta agaaagaata a 1251 

<210> 19 

<211> 1128 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 19 











tagatagtgg tggagtagaa 




ag"tfc ttctat 








aaattcaatt tgattttatt 




cj'tcfccfccj'cra.a 


aagaacaagg 


atttttagag 


gataaaatga 


aagaattggg tgcaaaggtt 








gaaaaagcct 


ctacatcagt 


ttcfcctctct tgctagaata 




ataaacraaacf 


gagattatga 




tgccatggct 


ataaatctgc aattggtctg 




atct tafccta 








atagtcatat ggcttatgta 




acagaaaaca 


gttttcaaaa 


agtattgcgt 


aaattagtsa 


caattttggt aaaaatctta 




gcaactcatt ggtttgcatg tggggaagat tcggctaagt ggttatatgg agagaaagcg 480
tataaagacg gaaaaattga aattattttt aatgcaattg atttgaaaaa gtatcaattt 540
ttgtcagatg ttagagaaaa atgtcgtaga gaattagatg tgtcaaataa gttcgtatta 600
ggaaatatag ctcgcctatc agatcaaaaa aaccaaagtt atttatttaa cgttttaaaa 660
gaactcattt taatcaaacc aaatgttatt ttactcctag ttggtaatgg tgaggatgag 720
cagaaattaa aacagaaagc tttagaacta aatctgaccc catatgtgct atttttaggg 780
agaaggactg atatttctga tttattatct gcgatggatg tttttttgct tccgtctaaa 840
tatgaggggt tgcctgtttc tctagtagag gctcaggcat cgggattaca aattttatcg 900
tcagatacag tgacgcaaga agtagatgtg accaaaaaca ttagttactt acctatcaac 960
gaagagtctg tgttgctatg gaaagataaa gtactgtctt taacatctga ggaatgcaat 1020
cgttttgaaa taaataacag tatgacagat ggactctatg atatttgtta tcaagctagt 1080
aaattattga atcgttatca agaaatgtgt gtaataaagg agatatag 1128



<210> 20 
<211> 1245 
<212> DNA

<213> Streptococcus pneumoniae 
<400> 20 



ttcgtgaaag 


ttgatagaat 


ttcatttata 


aaaaatacaa 


gttctctcta tattctgaat 


60 








ctcccgtatt tgacaagggt gctttcgcta 




gacgcgtatg 


gaatggttat 


ttatgttaaa 


gcgttaatag cttatgttca actggtgatt 




gattttggtt 






aatattgtaa atgcttgtac tactccctca 




aagattggaa 


ggatagttgg 




gttgaaaaaa 


tatttttatc tatcatttcg 


300 


attctaattt 


acaccatatt 


gatgtggcaa 


atcccaataa 


tgagagagaa tattcttttt 




tcagtttttt 


atttgtfcagc 


tacagtgacc 


aatattttta tctttgactt tttatttcgt 


420 


ggaattgaaa


agatgcatgc 


agttgcaatt 


ccttatatta 


tttctaaaac tatcattaca 


480 


attttgacat 


ttattgtagt 


aaaagatgat 


tcttctattt 


tatggattcc tatattggaa 


540 


ggaattggga


atttagttgc 


tgcagtagtt 


tcttatagat 


tccttcatta ttatggaatt 


600 


aaattatcat 


tttcttatct 


gtctgtttgg 


gttaaagatt 


taaaggaatc ctctatttat 


660 


tttttatcca 




tactattttt 


ggcgtcttta cgacagtcat ttcgggtttt 










gggatagcaa 


tgcaactgct ttcagcagca 




aaatcattgt 


ataatcctat 


agcgaatagt 


ttatatccgc 


atatgatacg tactaaagat 




atacaatcgg ttaagagtat taatcggatt atgtttattc ctattatctt tggagttttg 900
atagttttat tcttttcaaa tcaaattctt tctataattg gtggtgaaaa atataccgtt 960
tcagcagatt ttcttaagta cttattaccc gcttttgttg ctagttttta ttctatgatt 1020
tacggatggc ctgtcttagg agctattgat aaagtgaaag aaactacaat gacaactata 1080
ttagcttcga ttgtccaaac tttgggatta ggaatattta tcttgtctga taattttagt 1140
ttagtaacat tagctatttg ttcaagtatg tctgaggtgg tgttatggat tagccgttat 1200
ctaatttatt ttaagaaccg ttcattattt gttaggagta agtaa 1245



<210> 21 
<211> 5310 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 21 

aagtttatga ataaaggatt atttgaaaaa cgttgtaaat atagtattcg gaaattttca 60 
ttaggtgttg


cttctgttat 


gattggagct 


gcattctttg 


ggacaagtcc 


ggttcttgca 


120 


gatagcgtgc 


agtctggttc 


cacggcgaac 


ttaccagctcr 


atttagctac 


tgctcttgca 


180 


acagcaaaag 


agaa t gat gg 


gcgtgatttt 


gaagcgccta 


aggt QggsiQB. 


agaccaaggt 


240 




ttacagatgg 


acc taagaca 


gaagaagaac 


tat tagcac t 




300 




































caactatcct 






480 


























tggtgtcttt 






aagataccaa 


gaataafcgtt 


tttgtcggtt 




tggctggttc 


660 




aatctccaac 


aactagcact 


tggtatagag 


gtagtcgtgt 


tgctgctcct 


720 






tctctctatc 


























agaagaagat 


tcttctcaag 


gcgggctctt 


atccacgatga 


gcgaacagtt 
















































































acagtacgct 






















gatgacgaaa 


gcaaactact 


ttcttctatt 


agtttcctcg 


gcaatgcttt 


agtctctgtt 


1380 


tctagtaatc 


aaactggtgc 




g-Qg-gcaa.cc a 






1440 


agcggagatg atcatatcga tgtaaccaat ccaatgaagg atttggctaa gggttacatg 1500
tatggatttg tttctacaga taagcttgct gctggtgttt ggagtaactc tcaaaacagc 1560
tatggtggtg gttcgaatga ctggactcgt ttgacagctt ataaagaaac agtcggaaat 1620
gccaactatg taggaatcca cagctctgaa tggcaatggg aaaaagctta taagggcatt 1680
gttttcccag aatacacgaa ggaacttcca agtgctaagg ttgttatcac tgaagatgcc 1740
aatgcagaca agaacgttga ttggcaagat ggtgccattg cttatcgtag cattatgaac 1800
aatcctcaag gttgggaaaa agttaaggat atcacagctt accgtatcgc gatgaacttt 1860
ggttctcaag cacaaaaccc attccttatg accttggatg gtatcaagaa aatcaatctc 1920
catacagatg gtcttgggca aggtgttctc cttaaaggat atggtagcga aggccatgac 1980
tctggtcact tgaactatgc tgatattggt aagcgtatcg gtggtgtcga agacttcaag 2040
accctaattg agaaggctaa gaaatatgga gctcatctag gtatccacgt taacgcttca 2100
gaaacttatc ctgagtctaa atacttcaat gaaaaaattc tccgtaagaa tccagatgga 2160
agctatagct atggttggaa ctggctagat caaggtatca acattgatgc tgcctatgac 2220
ctagctcatg gtcgtttggc acgttgggaa gatttgaaga aaaaacttgg tgacggtctc 2280
gactttatct atgtggacgt ttggggtaat ggtcaatcag gtgataacgg tgcctgggct 2340
acccacgttc ttgctaaaga aattaacaaa caaggctggc gctttgcgat cgagtggggc 2400
catggtggtg agtacgactc taccttccat cactgggcag ctgacttgac ctacggtggc 2460
tacaccaata aaggtatcaa cagtgccatc acccgcttta tccgtaacca ccaaaaagat 2520
gcttgggtag gggactacag aagttatggt ggtgcagcca actatccact gctaggtggc 2580
tacagcatga aagactttga aggctggcag ggaagaagtg actacaatgg ctatgtaacc 2640
aacttatttg cccatgacgt catgactaag tacttccaac acttcactgt aagtaaatgg 2700
gaaaatggta caccggtgac tatgaccgat aacggtagca cctataaatg gactccagaa 2760
atgcgagtgg aattggtaga tgctgacaat aataaagtag ttgtaactcg taagtcaaat 2820
gatgtcaata gtccacaata tcgcgaacgt acagtaacgc tcaacggacg tgtcatccaa 2880
gatggttcag cttacttgac tccttggaac tgggatgcaa atggtaagaa actttctact 2940
gataaggaaa agatgtacta cttcaatacg caggccggtg caacaacttg gacccttcca 3000
agcgattggg caaagagcaa ggtttacctt tacaagctaa ctgaccaagg taagacagaa 3060
gagcaagaac taactgtaaa agatggtaaa attaccctag atcttctagc aaatcaacca 3120
tacgttctct atcgttcgaa acaaactaat cctgaaatgt catggagtga aggcatgcac 3180
atctatgacc aaggatttaa tagcggtacc ttgaaacatt ggaccatttc aggcgatgct 3240
tctaaggcag aaattgtcaa gtctcaaggg gcaaacgata tgcttcgtat tcaaggaaac 3300
aaagaaaaag ttagtctcac tcagaaatta actggcttga aaccaaatac caagtatgcc 3360






gtttatgttg 


gtgtagataa 


ccgtagtaat 


gccaaggcaa 


gtatcactgt 


gaatactggt 


3420 


gaaaaagaag 


tgactactta 


taccaataag 


tctctcgcgc 


tcaactatgt 


taaggcctac 


3480 


gcccacaata 


cacgtcgtga 


caatgctaca 


gttgacgata 


caagttactt 


ccaaaacatg 


3540 


tacgccttct 


ttacaactgg 


agcggacgtc 


tcaaatgtta 


cfcctgacatt 


gagtcgfcgaa 


3600 


gctggtgatc 


aagcaactta 


ctfctgafcgaa 


attcgtacct 


ttgaaaacaa 


ttcaagcatg 


3660 


tacggagaca 


agcatgatac 


aggfcaaaggc 


accttcaagc 


aagactttga 


aaatgttgcfc 


3720 


cagggtatct 


tcccatttgt 


agtgggtggt 


gtcgaaggtg 


ttgaagataa 


ccgcactcac 


3780 


ttgtctgaaa 


aacacaatcc 


atatacacaa 


cgtggttgga 


afcggtaagaa 


agfccgafcgat 


3840 


gtfcatcgaag 


gaaattggtc 


actcaagaca 


aatggactag 


tgagccgtcg 




3900 


taccaaacca 


tcccacaaaa 


cttccgtttt 


gaagcaggta 


agacctaccg 


tgtaaccttfc 


3960 


gaatacgaag 


caggatcaga 


caatacctat 


gcttttgtag 


tcggtaaggg 


agaattccag 


4020 


tcaggtcgtc 


gtggfcactca 


agcaagcaac 


fctggaaatgc 


atgaattgcc 




4080 


acagattcta 


agaaagccaa 


gaaggcaacc 


ttccttgtga 


caggtgcaga 




4140 


acttgggtag 


gtatctactc 


aactggaaat 


gcaagtaata 


ctcgtggtga 


ttctggtgga 


4200 


aatgccaact 


tccgtggtta 


taacgacttc 


atgatggata 


atcttcaaat 


cgaagaaatt 


4260 


accctaacag 


gtaagatgtt 


gacagaaaat 


gctctgaaga 


acfcacttgcc 


aacggfcfcgcc 


4320 


atgactaact 


acaccaaaga 


gtctatggat 


gctttgaaag 


aggcggtct t 


taacctcagt 




caggccgatg 


afcgatatcag 












t tgaagaatg 


ctttggttca 


gaagaagacg 


gcfcttggtag 


cagatgacfcfc 


fcgcaagfccfct 


4500 


acagctcctg 


ctcaggctca 


agaaggtctt 


gcaaatgcct 


ttgatggcaa 


tgtgtctagt 


4560 


ctsfcggcata 


catctfcggaa 


tggtggagat 


gtaggcaagc 


cfcgcaactat 


ggtcttgaaa 


4620 


gaaccaactg aaatcacagg acttcgctat gttccgcgtg gatcaggttc aaatggtaac 4680
ttgcgagatg tgaaacttgt tgtgacagat gagtctggca aggagcatac ctttactgca 4740
actgattggc caaataacaa caaaccaaaa gatattgact ttggtaagac aatcaaggct 4800
aagaaaattg tccttactgg taccaagaca tacggagatg gtggagataa ataccaatct 4860
gcagcggaac ttatctttac tcgtccacag gtagcagaaa cacctcttga cttgtcaggc 4920
tatgaagcag ctttggttaa ggctcagaaa ttaacagaca aagacaatca agaggaagta 4980
gctagcgttc aggcaagcat gaaatatgcg acggataacc atctcttgac ggaaagaatg 5040






gtggaatact ttgcagatta tctcaaccaa ttaaaagatt ctgctacgaa accagatgct 5100
ccaactgtag agaaacctga gtttaaactt agatctttag cttccgagca aggtaagacg 5160
ccagattata agcaagaaat agctagacca gaaacacctg aacaaatctt gccagcaaca 5220
ggtgagagtc aatctgacac agccctcatc ctagcaagtg ttagtctagc cctatctgct 5280
ctctttgtag taaaaacgaa gaaagactag 5310










<210> 22 
<211> 717 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 22 
aagggagagg atgaacctat gagaaaattt aaaatctttt tatttatcga agcctgtctt 60
ctgacaggag ctctgatttt gatggtatca gagcattttt cgcgttttct gctgatacta 120
ttcctctttt tgcttttgat tcgctactac actggtaaag agggaaataa tcttctttta 180
gtagcggcaa ccattctctt ctttttcatc gttatgctca atccttttgt gattctagct 240
atttttgttg cggttatcta tagcctcttt cttctttacc cgatgatgaa ccaggaaaaa 300
gagcagacca atttggtttt tgaagaggtc gtgacggtta agaaggagaa aaatcgttgg 360
tttggaaatc ttcatcattt ttcaagctac cagacttgcc aattcgatga tatcaatctc 420
tttcgcttca tgggcaagga cactattcat ctggagaggg tcatcttaac caatcatgac 480
aatgtcatta tcctcagaaa gatggtagga acgaccaaaa tcatcgtacc tgtagatgtg 540
gaagtcagtc tcagcgttaa ctgtctctat ggggatttga tttttttcaa ccagcccaag 600
cgagccctcc gcaatgaaca ctatcatcaa gaaacaaaag actatctcaa gagtaacaag 660
agtgtcaaga ttttcttgac cactatgatt ggtgatgtgg aggtggttag aggatga 717




<210> 23 
<211> 252 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 23 

gaggatacaa taatgaagaa aactgtttat aaaaaattgg gtatttcaat tattgcgagt 60
actttattgg ctagccagtt atcgacagta tctgctttga gtgttatttc tagtacaggt 120
gaagaatatg aggtaagtga gacactagaa aaaggtccag agtctaatga ttcttcatta 180
tctgagattt caccaacgta tggttcatac taccaaaagc aatcagaagt attatcggta 240
atgatgattt ga 252 

<210> 24 

<211> 2361 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 24 



acattcaaag acaaggaaat aaagatgaat aagaaaatat tagaaacatt agagttcgat 60
aaggtcaagg ccttgtttga gcctcatttg ttgaccgagc agggcttgga gcaattgaga 120
caactggctc cgactgccaa agcagataaa atcaaacagg cttttgctga gatgaaggaa 180
atgcaggctc ttttcgtcga gcaaccgcat tttactattc tctcaactaa ggaaattgca 240
ggagtctgca agaggttgga gatgggagcg gatctcaata tcgaggagtt cctactcttg 300
aaacgcgtgc ttcttgccag ccgagaactt caaaattttt acaccaatct ggaaaatgtc 360
agcttggaag aattagccct ttggtttgag aaattacatg attttccgca attacaagga 420
aatcttcagg cctttaatga tgcgggtttc attgaaaatt ttgccagtga agaattggcg 480
cgaatccgtc gaaaaataca tgatagcgag agtcaggtac gcgatgtttt acaagacttg 540
ctcaagcaaa aagcgcagct gttgacggaa ggaattgttg ctagcagaaa tggccgtcag 600
gttttaccag tcaaaaacac ctaccgcaat aagattgcag gtgtcgttca tgatatttct 660
gctagtggaa acaccgtcta tatcgaaccc cgtgaggtag tcaaactgag cgaagaaatt 720
gctagtctgc gagcagatga gcgctatgaa atgcttcgca ttctccaaga aatttctgag 780
cgtgtccgcc ctcatgcggc tgagattgct aatgacgctt ggattatcgg tcatctggac 840
ttgattcgtg ccaaggttcg atttatccaa gaaagacaag cagtcgtgcc tcagctgtca 900
gaaaatcaag agattcaact gctccatgtc tgccatcctt tggtcaaaaa tgccgtcgca 960
aatgatgtct attttggtca agatttaaca gctattgtca ttacaggtcc caatacaggt 1020
gggaagacca tcatgctcaa aactctgggc ttgacacagg tcatggccca gtcaggattg 1080
ccgattttag cagacaaggg aagtcgtgtt ggtatttttg aagaaatctt tgctgatatt 1140
ggagatgagc agtctattga gcagagcttg tctaccttct ctagtcatat gaccaatatc 1200
gtggatattc ttggcaaggt caaccaacat tcactcttac ttttggatga gttgggggct 1260
ggtactgatc cccaagaggg agcagccctt gccatggcta ttctggagga ccttcgcctg 1320
cgtcaaatca


agaccatggc 


gacgacccac tatccagaac tcaaggccta 


cggtattgag 


1380 


acagcctttg 


tgcaaaatgc 


cagtatggag tttgatactg caactcttcg 


cccgacctat 


1440 


cgctttafcgc 


agggtgttcc 


tggccgaagt aatgcctttg aaattgccaa 


acgtctaggc 


1500 


cfcafccfcgaag 


fctafccgfcagg 


agafcgccagfc cagcagafccg atcaggacaa 


tgacgtcaat 


1560 










1620 


cgtgaggtgg 










cttaatcgtg 




cgagcttaac aaggcgcgtg aacaggctgc 


tgagattgtg 


1740 






tgaccagatt ctcaaaaatc tccacagtaa 


atcccaactc 








agccaaggcc aagttgaaaa aattggctcc 


tgaaaaagtg 








ccttcaaaag gccaagaaaa aacgagctcc 


aaaggtggga 








ttatggtcag cgtggtacct tgaccagtca actcaaggac 




ggtcgctggg 




tggcttgatt aagatgacct tggaagagaa 


agagtttgat 


2040 






aaaaccagtc aagaagaaac aggtcaatgt 


tgtgaaacga 




acttctgggc 


gaggacctca 


agctagactg gatcttcgag gcaagcgcta 


tgaagaagcc 


2160 


atgaatgagc 


tagatacctt 


catcgaccaa gccttgctta acaatatggc 


tcaagttgat 


2220 


atcatccatg 


gtatcggaac 


aggagtcatc cgtgaaggag ttaccaaata cttgcaaaga 


2280 


aacaaacatg 


tcaagagttt 


cggctatgcc ccacaaaatg ctggaggcag 


tggtgcgact 


2340 


attgtcactt 


ttaaaggata 


g 




2361 



<210> 25 
<211> 294 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 25 

cagctgaggc acgactgctt gtctttcttg gataaatcga accttggcac gaatcaagtc 60 
cagatgaccg ataatccaag cgtcattagc aatctcagcc gcatgagggc ggacacgctc 120
agaaatttct tggagaatgc gaagcatttc atagcgctca tctgctcgca gactagcaat 180 
ttcttcgctc agtttgacta cctcacgggg ttcgatatag acggtgtttc cactagcaga 240 
aatatcatga acgacacctg caatcttatt gcggtaggtg tttttgactg gtaa 294 

<210> 26 
<211> 915

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 26 



ttattggagg ttaggatgaa aaaactcccc ttagtatttt caggttgttt gctaggtttg 60
gcaggagctg gaaatcttat tttagatacg ttgccggttc tatcccatct ttttagtctg 120
attggtttgg ttttatggat ttactttcta attctgcatc tctttaattg gaaagaaacc 180
aagcaagaat tgaccaagcc ccctcttttg tcaggaatgg caacctttcc tatggctggg 240
atgattttat cgacctatgt ctttcgcgta ttctcttatc ttcctttggt agcacaaggg 300
atttggtggt tttcatttct cttggatttg accttgattg ctggttttac catcaagttt 360
gcttgtccag ggcggagggt tcatgccact ccaagctgga cggttctcta tgtggggata 420
gcagtggctg ccttgaccta tcctctggta ggtattatcg aaattgccta tgcgaccttg 480
agttttggtt ttctcttgac cttctatctc tatcccctta tttatagcga tttaaagaaa 540
catccactcc cactagcctt gcttggacaa gaaggaatct actgtgctcc tttctctcta 600
ctcttggctt ctctagttcg agtaggagga accagcctgc cgacttgggt cttgattgtc 660
atgattttgg cttctcaatc cttctttttc tttgttttaa ctcgtctgcc caacatttta 720
aaacaaggtt ttcaaccagc cttctcagcc ctcaccttcc caaccattat cacagcgacc 780
tcgctcaaga tggctcaggg aattttgaaa cttccatttc tggattacct ggtattggct 840
gaaaccatta tatgcctaac tattttattc tttgtactag gtgcttatct gatttggtta 900
cgaaaaaagg tctag 915



<210> 27 
<211> 849 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 27 

tctatgtatc ttattgaaat tttaaaatct atcttcttcg gaattgttga aggaattacg 60 
gaatggttgc cgatttccag tacaggtcac ttgattttag cagaggaatt catccaatac 120 
caaaatcaaa atgaagcctt tatgtccatg tttaatgtcg tgattcagct tggtgctatt 180 
ttagcagtta tggtgattta ttttaacaag ctcaatcctt ttaaaccgac caaggacaaa 240 
caggaagttc gtaagacttg gagactatgg ttgaaggtct tgattgctac tttaccttta 300 
cttggtgtct ttaaatttga tgattggttt gatacccact tccataacat ggtttcagtt 360 
gctctcatgt tgattatcta cggggttgcc ttcatctatt tggaaaagcg caataaagcg 420
cgtgctatcg agccaagtgt aacagagttg gacaagcttc cttatacgac cgctttctat 480
atcggactct tccaagttct tgctctttta ccagggacta gccgttcagg tgcaacgatt 540
gtcggtggtt tgttaaatgg aaccagtcgt tcagttgtga cagaatttac cttctatctt 600
gggattcctg ttatgtttgg agctagtgcc ttaaagattt tcaaatttgt gaaagccgga 660
gaactcttga gctttgggca attgtttttg ctcttggtcg cgatgggagt agcttttgcg 720
gtcagcatgg tggctattcg cttcttgacc agctatgtga aaaaacacga cttcaccctt 780
tttggtaaat accgtatcgt gcttggtagt gttttgctac tttacagttt tgtccgttta 840
tttgtataa 849



<210> 28 

<211> 939 

<212> DNA 

<213> Streptococcus pneumoniae 



<400> 28 



aatgatgagt ttgaagataa agggatgctg ataaaaatgg taaaaacaaa aaagcaaaaa 60
cgaaataatc tcctattagg agtggtattt ttcattggaa tggcggtaat ggcgtatccg 120
ctggtgtctc gcttgtatta tcgagtggaa tcaaatcaac aaattgctga ctttgataag 180
gaaaaagcaa cgttggatga ggctgacatt gatgaacgaa tgaaattggc acaagccttc 240
aatgactctt tgaataatgt agtgagtggc gatccttggt cggaagaaat gaagaaaaaa 300
gggcgagcag agtatgcacg tatgttagaa atccatgagc ggatggggca tgtggaaatc 360
cccgttattg acgtggattt gccggtttat gctggtactg ctgaagaggt attgcagcaa 420
ggggctgggc atctagaggg aacttctctg ccgatcggag gcaattcgac ccatgcggtg 480
attacggcac atacaggttt gccaacagct aagatgttta cggatttgac caaacttaaa 540
gttggggata agttttatgt gcacaatatc aaggaagtga tggcctatca agtggatcaa 600
gtaaaggtga ttgagccgac gaactttgat gatttattga ttgtaccagg tcatgattat 660
gtgaccttgc tgacttgtac gccatacatg atcaataccc atcgtctatt ggttcggggg 720
catcggatac cgtacgtagc agaggttgag gaagaattta ttgcagcaaa caaactcagt 780
catctctatc gctacctgtt ttatgtggca gttggtttga ttgtgattct tttatggatt 840
attcgacgct tgcgcaagaa gaaaaaacaa ccggaaaagg ctttgaaggc gctgaaagca 900
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gcaaggaagg aagtgaaggt ggaggatgga caacagtag 939 

<210> 29 
<211> 903 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 29 

aggtggagga tggacaacag tagacgttca cgaaaaaaag gcacaaaaaa gaagaaacat 60
ccgctgatcc ttcttctgat tttcttagta ggattcgccg ttgcgatata tccattggtg 120
tctcgttatt attatcgtat tgagtcaaac gaggttatta aagagtttga tgagacggtt 180
tcccagatgg ataaggcaga acttgaggag cgttggcgct tggctcaagc cttcaatgcg 240
accttgaaac catctgaaat tcttgatcct tttacagagc aagagaaaaa gaaaggcgtc 300
tcagaatatg ccaatatgct aaaggtccat gagcggattg gctatgtgga aattcctgcg 360
attgatcagg aaattccgat gtatgtcgga acgagtgagg acattcttca gaaaggggca 420
gggctgttag aaggggcttc gctgcctgtt ggaggtgaaa atacccatac agtgatcact 480
gctcacagag gattgccaac ggcagaattg ttcagtcaat tggataagat gaaaaaaggg 540
gatatctttt atcttcacgt tttagatcag gtgttggcct accaagtgga tcagatagtg 600
acggtggagc cgaatgactt tgagcctgtc ttgattcaac atggggaaga ttatgcgacc 660
ttgttgactt gtacaccgta tatgattaac agtcatcgtc tgttggtacg tgggaagcgg 720
attccgtata cggcaccaat tgcagagcgg aatcgagcgg tgagagagcg tgggcaattc 780
tggttgtggt tattactagg agcgatggcg gtcatccttc tcttgctgta tcgcgtgtat 840
cgtaatcgac ggattgtcaa aggactagaa aagcaattgg aggggcgtca tgtcaaggac 900
taa 903

<210> 30 
<211> 1347 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 30 

aaaataaaaa aaggagttcc ggtgatgaac aatattttag cgtttttaga aacaaaagtc 60
gctccgtttg gtgaaaaagt tggcaaccaa cggcatttga aagctattcg tgaaggattt 120
atgatggcaa tgcctttgat tttagtcggc tctttatttc ttattctaat cagttggcct 180
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gctgaatagt gttggattgc taagtatctt gacaactatg 








tatctccttg gtcgcttgtt tcggtattgc 


ctacaggttg 








tggtccgtcg gcagggatca tagccttatc 


cagttttgta 






ctcgtttttc 


gagtatggtt tatgataaaa atggggagca ggtcaagcag 






gcgcaatacc 


attttctagc ctgaatgcat cttctttgtt 


tatggcgatt 






tggttacagc 


agagatttat cgtatgttta tccagcgcgg aattacgata 








agatgtagta agtaaatcat tttcagctct 


tttatctggt 




tttactactt


ttgttttgtg 


ggctttggtc ttaaaaggtc ttgaagcggc 


aggagttgca 




ggaggtctca 


acggactcct 


aggtgcaatt gttggaacac cgcttaagtt aattgcagga 








atgtgttatt gtaaactcat tcttttggtt 


ctgtggagtt 








tgcttttgta gacccagttt ggttacaatt 


tactacagaa 






ctgtggctgc 


aggacaaaca ctccaacaca ttattacatt 


accgtttaaa 




gatttatttg 


tatttattgg 


tggcggtgga gcgactattg gtcttgcgat 


ttgtctcttc 




ctatttagta 


agagtcgtgc 


gaataaaaca ttaggtaagc tagctattat 


accgtctatt 




tttaatatca


atacagctat 


tctatttacg tttccaacag ttttaaatcc 


gattatgctg 




attccgttta 


ttgctactcc 


tacaatcaat gccttgatta cctatgtatc 


aatggctgta 


1140 


ggattagtac 


cctatacaac 


aggtgtaatc cttccgtgga caatgccacc 


gattatagga 


1200 


ggcttccttg 


caacaggggc 


tagttggcga ggagctctat tacaagttgt tttgattttg 


1260 


gtttctgtag 


caatttatta 


tccattcttc aaaattgcag ataaacgcaa 


tcttgaaaaa 


1320 


gaaaaagcta 


ctgttggagg 


gaaataa 




1347 



<210> 31
<211> 1701 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 31 

attttttata ggaggagttt tatggataag ctagtcgctg ccattgaaaa gcaacaaggg 60 
aaatttgaaa aaatttctac taataactat atgatggcta ttaaagatgg attcattgct 120 
actatgcctt taattatgtt ttcaagcttt ttgatgatta ttattatgat tcctaaaaat 180 
ttcggagtag agttaccgag tccagctatt gtctggatga gaaaagtgta tatgttaacc 240 
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actgttggaa agtcattagt 


tggaaatgtt 










aatgatattt ctgcaatgtt 


ggcagccata 










gtagttgatg agaagacggg atctacaagt 










ttgataactt cgtttgtcag 


tgcctttatt 






tttaccgatt


ctgtattaag 


cgagacatta ctattcattt 


acctaaggaa 




g 1 1 c c t gggg 


ctatatcaca 


agcttttaga 


gatattttcc ctttttcttt 


tgttttactt 










tttagtttag atgttccttt 


tgcccaagta 






tattgactcc


caccc Luaag 


ggggcagaat catatcctgc 


tatgatgttg 






tgtgtgcttt 


gctttggttt 


gttggaattc atggaccatc 


tattgtctta 










atggaagaga atgctcaact 


tcttgcaaat 










aatttcggga attatatcgc 


tgctattgga 




ggaacggggg 






attttgattt tctttatgcg 


gtctaaacaa 










cctgttttat ttgcggtaaa 


tgaacctctt 










tatctttttg tccctttttt gatgactcca 




ccagtgaatg 


tatttctagg 


aaagg ticttt 


attgatttct ttggaatgaa tggattttat 




atccagttac


cttggacctt 


tcctggtccc 


ttgggattgt taattggaac 


gaattttcaa 






ttgtattttt 


atctttgatt


ttagttgtcg acatattgat 


ttatttgcca 




ttctgtagag


cgtatgatag 


acagttactg


gtgaaagaag atattgcaag ctcaaatgat 




attattttag


aggaggatac 


aagtgaaata 


attcctggtg agatagatga aataaaaagt 




aaggagttga 


aagtactggt


tctttgtgca 


gggtctggaa caagtgcgca attagccaat 




gcaattaacg 


agggggctaa 


cttaacagag 


gttagagtga ttgcgaattc 


aggagcgtac 


1500 


ggagctcatt 


atgatattat 


gggtgtttat 


gatttaatta ttctggcccc acaagttcgg 


1560 


agttattata 


gagagatgaa 


ggtggatgca 


gaaagattag gtattcagat 


agttgctacc 


1620 


agaggaatgg 


aatatattca 


tttaacaaag 


agtccaagta aagccttaca atttgtattg 


1680 


gagcattacc 


aagctgtgta 


g 






1701 



<210> 32 

<211> 1704 

<212> DNA 

<213> Streptococcus pneumoniae 
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<400> 32 
gaagcgaatg aagagagtaa gatgaaagaa gctataattg agtggaagga tttctctttc 60
cggtatgaaa cacaacaaga accgaccttg caagggatag acttgaccat ttacaaggga 120
gagaaagtct taattgttgg accatctggg tcaggtaaat ctaccttggg tcagtgtttg 180
aatgggatta ttcccaatat ttacaagggt cagacatatg gagaattttt gataaagggt 240
caagtagcct ttgatatgag catctatgat aagtctcatc tggttagcac agttttgcag 300
gatacagatg ggcagtttat tggcttgtct gtggcagaag atttggcgtt tgctctggaa 360
aatgatgtga cagccctaga tgagatgaaa ggtcgtgttt ataaatgggc tgaaaagctg 420
gaccttcttc ctttactgga tcagcgtcct caggatttgt caggtggaca aaagcagcga 480
gtcagtctgg ctggtgtctt gattgatgaa agtccgattc tcttgtttga tgagccactc 540
gccaatctag atcccaagtc aggtcaggat attatcgaat tgattgacca gattcataag 600
gaagagggga cgacgactct tattatcgag caccgtttgg aggacgttct gcatcgccct 660
gtggatcgga ttgtcttgat aaacgatggt cgtatccttt ttaatgggag ccctgaccag 720
ttgcttgcga ctgatttatt gactcaaaat ggaattcgag aaccccttta tctaacgact 780
ctccgtcaat taggtgtgga cttagtcaag gaagaacaat tagcgaatct ggataacttg 840
tctatctcaa aaggtcaggt tcagttgcag aatgaactgg caaaagaaac cccagcattg 900
cagtcactct ttagactaga ggaagtatct ttttcttatg atgatagacc gattttaaaa 960
tccctacatt tagatattaa aaagggtgaa aagattgcta ttgtcggaaa aaatggagca 1020
gggaaatcaa ctctagccaa ggctataagt agctttattc agacggaagg acgctatctt 1080
tgggaaaaac aggatataaa aggcgattct gttgcagagc gggcggaacg agtaggatat 1140
gtgctacaaa atcctaatca aatgatttca accaatatga tttttgatga ggtggctcta 1200
gggctccgtt tgcgaggtgt ggatgagaag gaaattgaaa cgagagtata tgaaaccttg 1260
aaaatctgtg gactttatga attccgtaat tggcctattt ctgccctgtc atttggtcag 1320
aaaaaacgtg tcaccattgc ttcaattttg gtcttaggag ctgaaattat tctcctagat 1380
gaaccgactg caggtcaaga tcagaagaac tatactgaga ttatggaatt tctcgaagag 1440
ttacatcaaa aagggcatac cattgtcatg attacccatg atatgcaatt gatgctggat 1500
tattcagacc gggtccttgt catggtggat ggagaattga ttgccgatac tgttccagcc 1560
agtctgttga gcgatcctga gctgttagta aaagccaatc taaaagaaac ctccatcttt 1620
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aacttggcta agaaactaga tgtggatcca ctggatttaa cggcatttta caaagaaagg 1680 
agagagggat gcaagctaaa ttaa 1704 

<210> 33 

<211> 1668 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 33 





gcgattattt 






ttatttcact 
















cctttctttg 


gagttctggt


tggtattggg atgactgctc taattcagtc tagttctggt 




gtaacagtta 


tcacagtcgg 


cctggtcagt gccggtctct taaccttacg tcaggctatc 




gggattgtca 


tgggtgctaa 
























tgagcggcgc 






atgattgagc 




tcctgttttg ggtgtccttg tcggtactgg cttgaccttg






cttcttcggc 


taccattggg attttacaaa acctctacgc cggcaatcta 




attgatctac 


agggagcttt 


gccagttcta tttggtgaca atatcgggac aaccattaca 


720 


gccatcattg 


cctctttagg 


ggctaatatt gcagctaaac gggtagcagg agctcatgtt 


780 


gccttcaacg 


ttatcggaac 


agttgtctgc gttatttttc tagttccttt tactgtcctg 


840 


attcattggt 


ttgaagctac 


gctaaatcta gcaccggaaa tgaccatcgc ctttgctcac 


900 


ggaaccttta 


atattaccaa 


caccattgtc caatttccat ttatcggagc tctggcttac 


960 


tttgtaacca 


agattattcc 


tggagaggac gaggttgtca aatacgaacc cttatatctt 


1020 


gatgaacatt 


tcatcaaaca 


ggccccatct atcgctctag gaaatgctaa gaaagagctc 


1080 


ttgcacttag 


gaaactacgc 


tgctaaagcc tttgaccttt cctataagta catcattgac 


1140 


ttggatgaaa 


aagttgctga 


aaaagggcat aaaaccgaag aagcaattaa caccatcgat 


1200 


gagcaattaa 


cacgttatct 


cattgccctt tcaagcgaag ctctcagcca aaaagaaagt 


1260 


gaagtgctta 


ccaatatcct 


tgattcctcc cgtgatttgg aacggattgg agaccacacg 


1320 
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gaggctctac tcaatctgac 


tgactatctt 


caacggaaaa 


atgttgaatt 


ttctgatgcc 


1380 


gccttgaaag aattagagga agtttaccgc 


caaactagtg 


actttatcaa agatgctctg 


1440 


gatagtgtgg aaaacaatga 


tattgaaaaa 


gcacgcagtc 


ttgtagaacg 


tcatgaagca 


1500 


atcaataaga tagaacgtgt 


tctcagaaaa 


acccacatca 


aacgcctcaa 


caaaggcgaa 


1560 


tgttcaacac aagctggggt 


caactttatc 


gacatcatct 


cacactacac 


tcgtgtatca 


1620 


gaccacgcta tgaaccttgc 


tgaaaaggtt 


tttgcagaac 


aaatctaa 




1668 



<210> 34 

<211> 4989 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 34 



gaggagaaaa 






agacgttgtc 


gttacagtat 


tcgtaagtta 




tcagtaggag 




gatgattggt 




ttgctggtcc 


agccttggct 








aaatagcgga 


gctaatacag 


agcttgtttc 


aggagagagt 




gagcattcga 


ccaatgaagc 


tgataagcag 


aatgaagggg 


aacatgctag agaaaacaag 


240 


ctagaaaagg 


cagaaggagt 


agcgatagca 


tctgaaactg 


cttcgccagc 


aagcaatgaa 


300 


gctgcaacta 


ctgaaactgc 


agaagcagct 


agcgcagcta 


aaccagagga aaaagcaagt 


360 


gaggtggttg 


cagaaacacc 


atctgcagaa 


gcaaaaccta 


agtctgacaa 


ggaaacagaa 


420 


gcaaagcccg 


aagcaactaa 


ccaaggggat 


gagtctaaac 


cagcagcaga 


agctaataag 


480 


actgaaaaag 


aagtccagcc 


agatgtccct 


aaaaatacag 


aaaaaacatt 


aaaaccaaag 


540 


gaaatcaaat 


ttaattcttg 


ggaagaattg 


ttaaaatggg 


aaccaggtgc 


tcgtgaagat 


600 


gatgctatta 


accgcggatc 


tgttgtcctc 


gcttcacgtc 


ggacaggtca 


tttagtcaat 


660 


gaaaaagcta 


gcaaggaagc 


aaaagttcaa 


gccttatcaa 


acaccaattc 


taaagcaaaa 


720 


gaccatgctt 


ctgttggtgg 


agaagagttc 


aaggcctatg 


cttttgacta 


ttggcaatat 


780


ctagattcaa 


tggtcttctg 


ggaaggtctc 


gtaccaactc 


ctgacgttat 


tgatgcaggt 


840 


caccgtaacg 


gggttcctgt 


atacggtaca 


ctcttcttca 


actggtctaa 


tagtattgca 


900 


gatcaagaaa 


gatttgctga 


agctttgaag 


caagacgcag 


atggtagctt 


cccaattgcc 


960 


cgtaaattgg 


tagacatggc 


caagtattat 


ggctatgatg 


gctatttcat 


caaccaagaa 


1020 


acaactggag 


atttggttaa 


acctcttgga 


gaaaagatgc 


gccagtttat 


gctctatagc 


1080 
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aaggaatatg 


ctgctaaggt 


aaaccatcca 


atcaagtatt 


cttggtacga 


tgccatgacc 


1140 


tataactatg 


gacgttatca 


tcaagatggt 


ttgggagaat 


acaactacca 


attcatgcaa 


1200 


ccagaaggag 


ataaggttcc 


ggcagataac 


ttctttgcta 


actttaactg 


ggataaggct 


1260 


aaaaatgatt 


acactattgc 


aactgccaac 


tggattggtc 


gtaatcctta 


tgatgtattt 


1320 


gcaggtttgg 


aattgcaaca 


gggtggttcc 


tacaagacaa 


aggttaagtg 


gaatgacatt 


1380 


ttagacgaaa 


atgggaaatt 


gcgcctttct 


cttggtttat 


ttgccccaga 


taccattaca 


1440 


agtttaggaa 


aaactggtga 


agattatcat 


aaaaatgaag 


atatcttctt 


tacaggttat 


1500 


caaggagacc 


ctactggcca 


aaaaccaggt 


gacaaagatt 


ggtatggtat 


tgctaaccta 


1560 


gttgcggacc 


gtacgccagc 


ggtaggtaat 


acttttacta 


cttcttttaa 


tacaggtcat 


1620 


ggtaaaaaat 


ggttcgtaga 


tggtaaggtt 


tctaaggatt 


ctgagtggaa 


ttatcgttca 


1680 


gtatcaggtg 


ttcttccaac 


atggcgctgg 


tggcagactt 


caacagggga 


aaaacttcgt 


1740 


gcagaatatg 


attttacaga 


tgcctataat 


ggcggaaatt 


cccttaaatt 


ctctggtgat 


1800 


gtagccggta 


agacagatca 


ggatgtgaga 


ctttattcta 


ctaagttaga 


agtaactgag 


1860 


aagaccaaac 


ttcgtgttgc 


ccacaaggga 


ggaaaaggtt 


ctaaagttta 


tatggcattc 


1920 


tctacaactc 


cagactacaa 


attcgatgat 


gcagatgcat 


ggaaagagct 


aaccctttct 


1980 


gacaactgga 


caaatgaaga 


atttgatctt 


agctcactag 


cgggtaaaac 


catctatgca 


2040 


gtcaaactat 


ttttcgagca 


tgaaggtgct 


gtaaaagatt 


atcagtttaa 


cctaggacaa 


2100 


ttaactatct 


cggacaatca 


ccaagagcca 


caatcgccga 


caagcttttc 


tgtagtgaaa 


2160 


caatctctta 


aaaatgccca 


agaagcggaa 


gcagttgtgc 


aatttaaagg 


caacaaggat 


2220 


gcagatttct 


atgaagttta 


tgaaaaagat 


ggagacagct 


ggaaattact 


aactggctca 


2280 


tcttctacaa 


ctatttatct 


accaaaagtt 


agccgctcag 


caagtgctca 


gggtacaact 


2340 


caagaactga 


aggttgtagc 


agtcggtaaa 


aatggagttc 


gttcagaagc 


tgcaaccaca 


2400 


acctttgatt 


ggggtatgac 


tgtaaaagat 


accagcctac 


caaaaccact 


agctgaaaat 


2460 


atcgttccag 


gtgcaacagt 


tattgatagt 


actttcccta 


agactgaagg 


tggagaaggt 


2520 


attgaaggta 


tgttgaacgg 


taccattact 


agcttgtcag 


ataaatggtc 


ttcagctcag 


2580 


ttgagtggta 


gtgtggatat 


tcgtttgacc 


aagccacgta 


ccgttgttag 


atgggtcatg 


2640 


gatcatgcag 


gagctggtgg 


tgagtctgtt 


aacgatggct 


tgatgaacac 


taaagacttt 


2700 
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cfa.ccttfca.fct 






tggaagctag 


ctaaggaagt


ccgtggtaac 


2760 


aa.acjca.ca.ccf 


tgacagatat 


cactcttgat 


aaaccaatca 


ctgctcaaga


ctggcgcttg 


2820 














2880 






















































ttattatcgt 




































tggtgttctt 












































aaaactattt 




















































ttgatgtttt 
















agctgctaag


aacaaggtgg 










































































agagcttaca 




acagcgaaac 


agagtctaaa 


agatctggtt 


gctttattga 


aagaagacaa 


gccagcagtc 


4140 


ttttctgata 


gtaaaacagg 


tgttgaagta 


cacttctcaa 


ataaagagaa 


gactgtcatc 


4200 


aagggtttga 


aagtagagcg 


tgttcaagca 


agtgctgaag 


agaagaaata 


ctttgctgga 


4260 


gaagatgctc 


atgtctttga 


aatagaaggt 


ttggatgaaa 


aaggtcaaga 


tgttgatctc 


4320 


tcttatgctt 


ctattgtgaa 


aatcccaatt 


gaaaaagata 


agaaagttaa 


gaaagtattt 


4380 






ttcttacctg 


aaggcaaaga 


ggcagtagaa ttggcttttg aacaaacgga 


tagtcatgtt 


4440 


atctttacag 


cacctcactt 


tactcattat gcctttgttt atgaatctgc 


tgaaaaacca 


4500 


caacctgcta 


aaccagcacc 


acaaaacaca gtccttccaa aacctactta 


tcaaccgact 


4560 


tctgatcaac 


aaaaggctcc 


taaattggaa gttcaagagg aaaaggttgc 


ctttcatcgt 


4620 


caagagcatg 


aaaatactga 


gatgctagtt ggggaacaac gagtcatcat 


acagggacga 


4680 


gatggactgt 


taagacatgt 


ctttgaagtt gatgaaaacg gtcagcgtcg 


tcttcgttca 


4740 


acagaagtca 


tccaagaagc 


gattccagaa attgttgaaa ttggaacaaa agtaaaaaca 


4800 


gtaccagcag 


tagtagctac 


acaggaaaaa ccagctcaaa atacagcagt 


taaatcagaa 


4860 


gaagcaagca 


aacaattgcc 


aaatacagga acagctgatg ctaatgaagc 


cctaatagca 


4920 


ggcttagcca 


gccttggtct 


tgctagttta gccttgacct tgagacggaa aagagaagat 


4980 


aaagattaa 








4989 



<210> 35 

<211> 1029 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 35 



gcaagcttcc ttcctctgat tttcaaacaa aaatctctca ttgcttacat tgttctctca 60
agcttattgg tcactattat caatataggt ggttcttact atctccaagg aatcttggat 120
gaatacattc caaatcagat gaaatcaact ttaggaatca tctcagttgg tctggttatc 180
acctatatcc tccaacaagt catgagcttc tccagagatt atctcctaac cgttctgagt 240
cagagattaa gtattgatgt gattttatcc tatattcgcc atatttttga acttcccatg 300
tctttctttg cgacacgtcg tacaggagaa atcatttcac gattcacaga tgctaactct 360
attatagatg ccttggcttc taccattctt tctctttttc tggatgtttc tattctgatt 420
cttgtaggag gcgtcttact ggcacaaaac cctaatctct tccttctttc tcttatttcc 480
attcctatat acatgttcat catcttttct tttatgaaac ctttcgaaaa aatgaaccat 540
gatgtcatgc aaagtaattc tatggttagc tctgccatta tcgaagatat caacgggatt 600
gaaactataa agtcgctcac gagtgaagaa aatcgctatc aaaatataga cagcgaattt 660
gtagattatt tggaaaaatc ctttaagctc agtaaatatt ctattttaca aacgagttta 720
aagcagggaa caaaattagt tctgaatatc cttatcctat ggtttggcgc tcaattagtc 780
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atgtcaagta aaatttctat cggtcagctg attaccttta acacactttt ttcttacttt 840
acaactccta tggaaaatat tatcaacctc caaaccaaac tccaatctgc gaaggtcgct 900
aataaccgtt tgaacgaagt ctatctagtc gaatctgaat ttcaagttca agaaaaccct 960
gttcattcac attttttgat gggcgatatt gaatttgatg acctttctta taagtatggt 1020
tttggatga 1029



<210> 36 
<211> 288 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 36 

ggtcttgggg taaaaaaaca aaaggcttgc ttttcagcca tagaggaggt catcatgtat 60 
aaacacttat ttttcctaga ttccaaaact ttagatcggt tgacacccta tattctagtc 120
ttggcttctg acaccattgc ttttaatgtt tttgtgctaa cctttgtatc tgcggtggtt 180 
tttaatttcc taaattccat gctagcttta atggctatat tcataggggc tggctatgtg 240 
gtcggatttt ggttactaat actcaatgaa aatcaaagag caaactag 288 

<210> 37 

<211> 648 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 37 



cgtgtaggaa gtctgtttgt tgaggaggat aattttatgg agttttttga taaatttcat 60
gccttgtgtt ttggattttt agtactaata attgtcatta cagttcctta tacgattaac 120
catgggggtt tttttcaaaa tgaatctgca ttgattcttg taagtcttct tgtaacctcg 180
ctgagtgttg cttatgctag aaagtttgaa atgatttctt ttgggatgtt aagcaagaaa 240
caacttttgc ttttcattgc aatctttctt ctaagtgtac ttgagacgct ggtttatatt 300
catttcttcg ctgtttcttc tggctcaggg gtccaacact tggcggaagt cagcagagga 360
atttccctgt ctttgatttt gactacctca gtttttggcc ccatccagga ggaactcatt 420
ttcagaggac ttcttcaagg tgcggttttt gacaattctt ggttagggct tgtgctaact 480
tcctctctct tttctttcat gcatggacct tctaatgtcc cttcgtttat tttttatcta 540
cttgggggtt tgttgctggg ctttgcttat aaaaagagtc aaaacctatg ggtttctact 600
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ctagtccata tgctttacaa cagttggcca ctcttatatt atttataa 648 

<210> 38 

<211> 1848 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 38 



gagaatacca tgagttataa agatacggta caaaaaatcc tcgatgtaat tggaggtgaa 60
aaaaatgtca atagagttac ccattgtgta acacgtttaa gattagaatt aaaagatgaa 120
aatttagtca atgatgatga tgtgaagaag ataccaggtg taataggtat tatgaaaaag 180
aatggacaat atcaaattat acttggtaat gatgtagcta attattataa agaattcgtt 240
aaacttggca attttgaatc cgattcagtt gttcaagggc acaaagggaa tattttagaa 300
agaatcattg agtatatcgc tggttccatg actccaatca ttccagcaat gttaggggga 360
ggtatgttga aagtcttggt aatcatttta ccaatgcttg gtatattgca atcagattct 420
cagactattg cttttttgac attttttggg gatgctccat attatttctt accgctgtta 480
ttagcttatt ctgcatcaca aaaattaaaa gtaacatcta cattagctat gtctgtagca 540
ggtgtacttc tccatccaaa ttttgttcaa atggtgcaat cagggaatcc tcttagttta 600
tttggtgcac ctgtgacacc agctagttat ggttcatcag tcgttccaat tcttattatg 660
gtttggttga tgaaatatat tgaaaaaata attgctaaat taacactagc tattactaag 720
agttttttgc aacctacgct agtattatta gtatcaagct gtattgcctt agttgtagtc 780
ggacctattg gagtaattgt tggtgaagga ttatcaaatc tagttgggca aatgtatggt 840
gtagctggat ggcttacatt agctattctt ggtgctatta tgccatttat tgttatgact 900
ggaatgcatt gggcttttgc acctattttt ttggcggcat ctattgctac tccagacgta 960
ttaattcttc cagcaatgtt agggtcaaac ttagctcaag gggctgcttc gatggctgtt 1020
gcattaaaga gtaaaaataa taatacaaaa caaattgctt ttgcagcagg tttctcagcc 1080
ttacttgcag ggattaccga acctgcatta tatggtgtga ctttaaaata taaaaaaccg 1140
ctttatgcag ctatgattgg tggtggatta gcgggattat ttgcaggtct tactagtgtt 1200
aaagcatatc tatttgctgt cccatctttg atagcgttgc ctcaatttat ttattctgat 1260
gtgccatcaa atattgtaaa tgctttaatt gtggcggtca tttcggttgt tattaccttt 1320
gtattagctt atatatttgg aatcgatgaa gaagagagtt ctagcaattt agaagttgaa 1380
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gctggagttt caaataaaaa aatgatattt tctcctatat caggagaaat cattccgtta 1440
agcgatgtcc aggataaaac attttcagat aaactaattg gagacggagt agcgattatc 1500
ccaagtgaag gtaaggttba tgcaccattt gatgggaaaa ttacaaatat ttttccgact 1560
aagcacgcaa ttggattgaa gagtgatgag ggtgttgagt tactaattca tattggatta 1620
gatactgttg agctaaaagg tcaaggtttt attagtcatg tagaagaagg agacagagtt 1680
ttcaaaaatc agttgatttt tgaaatggac ttgaatttaa tcaagactaa aggctacgaa 1740
acagttacac cagtaattgt aacgaatacc aatgattttc tagatgtatt agtattacct 1800
aataatcaga caatcgagca ttctaaggaa ttactggtaa tattataa 1848



<210> 39 

<211> 246 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 39 



atagctggca agtgggcaat ggtcggaatc gccaaatcat tttggataaa gtcagccaaa 60
cgaaccgtcg tttccttgat attaaattta ttattgctgg cagttacact gataaaatgg 120
ggagccaact cctgcatatc ctgcaaggct gaaataatgt tatcattacc cacggctggg 180
tttggaggga acacttcaaa tgagagtgac ggtgtttggc gtgacatatg taataacctt 240
ttctag 246



<210> 40 

<211> 669 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 40 



gaggttacta tggaatctat tttagaagtt ttaaccccag ataacctagt ctttatcttt 60
aaaggatttg gcttgaccct ctatatttct ctgattgcca tcatcctctc tactatcatc 120
ggtacggtgc tagctgtcac gagaaatggc aaaaatcctg tcttacgcat tatttccagt 180
atttatatcg agtttgtgcg caacgttccc aaccttctct ggatttttac tatctttttg 240
gtgttcaaaa tgaaatccac accagcaggt attacagcct ttactctctt tacatcagca 300
gccttggctg agattattcg aggcggtctc aatgccgtag acaagggaca gtacgaagca 360
ggaatgtcac aaggcttcac ctcagcccaa atcctctact acatcattct cccacaagcc 420
atccgcaaaa tgctaccagc catcatttct cagtttgtta ccgtgattaa ggataccagt 480
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ctcctctact ctgttatcgc cctacaagaa ctctttggag ccagccaaat tctcatgggc 540
cgttatttcg aaccagagca ggtcttcagt ctttacatcc tgattgccct catctacttc 600
agctttaacc tagcaatttc tagcctgtct catatgctag ccaaacgttg gcaacaagct 660
gcagaataa 669

<210> 41
<211> 768
<212> DNA

<213> Streptococcus pneumoniae
<400> 41

agatctctca tggctttagt agaatttaaa aacgtcgaaa aatattacgg agactaccac 60
gcattccgca acatcaatct ccgttttgaa aaaggacaag ttgttgtcct gcttggacct 120
tctggctctg ggaagtccac tcttatccgt acgatcaatg gtttagagac tgttgacaaa 180
ggaagtctcc tagtcaatgg gcaccaagtt gctggtgcca gccagaaaga tttggtacct 240
cttcgcaagg aagtcggcat ggtttttcaa cattttaacc tttatccaca caaagctgtg 300
ttagaaaacg taacgcttgc acccattgaa gttctaggaa ttgataaaaa agaagctgaa 360
aaaaccgccc aaaaatatct ggaatttgta aatatgtggg acaagaaaga ttcctatccc 420
gccatgctat ctggtggaca aaaacagcgg atcgccatcg ctcgtggtct tgctatgcat 480
ccggaactcc tcctctttga tgaaccaaca tctgctcttg atcctgagac tatcggagat 540
gttctagcag ttatgcagaa actggcgcat gatgggatga acatgatcat cgttacccac 600
gaaatgggct ttgctcgaga ggttgcggac cgcattatct ttatggccga cggagaagtt 660
ttagtagata cgacagatgt cgataacttt tttgacaatc caagcgaacc tcgtgcccaa 720
caattcctca gcaaaattat caaccacgaa 768

<210> 42
<211> 1224
<212> DNA

<213> Streptococcus pneumoniae
<400> 42

gaaatgtacc gttatcaaat tggcattccc acattagaat atgatcagtt tgtcaaagaa 60
catgaattag ccaatgtatt acaaagtagt gcttgggagg aagttaagtc taattggcaa 120
catgagaagt ttggtgttta cagggaagaa aaattactgg cgacagctag tattttgatt 180
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agaactcttc 


cgctaggcta


taaaatgttt tacatcccaa gaggacctat attggattat


240 












ttgtgacttt 
















































































ctttgagttt 


ggaatttggt actacctctg tcaatatata tgctggtatg 




gatgatgatt 


ttaaacgtta 


caatgcacca attttaactt ggtatgaaac ggctcgctat 


1020 


gcctttgaac 


gaggtatgat 


ctggcaaaat ttaggtggtg ttgaaaactc tctcaatggt 


1080 


ggactttatc 


attttaagga 


aaaatttaat ccaacgattg aagaatactt gggtgaattt 


1140 


acaatgccca 


ctcatcctct 


ctatcctctg ttaagacttg ctcttgattt ccgtaaaaca 


1200 


ttaagaaaaa 


aacatagaaa 


gtaa 


1224 



<210> 43 
<211> 636 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 43 

tgcttttttc agactcctaa tcgtggtata ctaggtcagt attttataaa tatgaaggag 60 
atttttatgg ctaaaaaagg taccctaaca ggtttgctcc tgtttggaat attttttggt 120
gcggggaact tgatttttcc gccttctcta ggtgctctat ctggagaaca ttttcttcct 180 
gccatcgcag gttttgtctt ttcaggcgtt ggtatcgccg tcttgaccct tattattgga 240 
acgctaaatc ctaaaggata tatctacgag atttcaacga agatagcgcc ttggtttgcg 300 
actctttacc tctcagttct ttacttgtca atcggtccat tctttgctac cccacgtact 360 
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gctacaacag cttacgaagt agggattagc ccccttttgt cggatgcaaa taaaggactt 420
ggcttgattg tatttacggt tctgtatttt gcggcagcct atttgatttc gcttaatcca 480
tcaaaaatct tagaccgcat tggacgtatt ttaacgccag tctttgcaat tttgattgtt 540
atcttggtcg ttctgggagc tatcaaatat ggtggaacaa gtcctcaagc tgcttcactg 600
cttatcaagc ttctgccttt ggtacaggtt tcctag 636



<210> 44 

<211> 2049 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 44 



tccatcaaaa atcttagacc gcattggacg tattttaacg ccagtctttg caattttgat 60
tgttatcttg gtcgttctgg gagctatcaa atatggtgga acaagtcctc aagctgcttc 120
actgcttatc aagcttctgc ctttggtaca ggtttcctag aaggttacaa taccttggac 180
gcccttgcct cagtggcctt tagcgtaatc gcagttcaaa ccttgaaaca acttggattt 240
tcaagtaaga aagaatacat ttcaactatt tgggttgttg gtatcgttgt tgcccttgcc 300
ttcagcgctc tttacatcgg tttaggtttt cttggaaatc atttcccagt accagctgaa 360
gcgatgaagg gtggaacacc aggtgtttac atcttgtcac aagccactca agaaatcttt 420
ggctcaacag ctcaactctt ccttgcagct atggttaccg taacctgctt cacaacgact 480
gttggtttga ttgtgtcaac agctgagttc tttaatgagc gcttcccaca aatcagctac 540
aaggtttatg cgacagcctt taccttgatt ggatttgcta ttgccaattt gggtcttgat 600
gcgattatca agtactcaat tccagtactg gttatcttgt acccaatcac gattgctatc 660
gttatgattg tcattgtcaa caaatttgtt gccctttcaa aaccaggtat gcagttgaca 720
attgctgtgg ttacagttat tgccattgca agcgtactag gaagctcgtt aaggttgagt 780
ttcttgcaaa tcttgttagc gttcttcctt ttgccaaggc atctctccca tggttggtgc 840
cagccattgt tggaatcttg ctctcattgg ttctaccaaa caagcaagaa agcgatgttt 900
ttgaaatgga ataatcactt aaatcacttt tgtagccaag tctacaggag tgattttctt 960
tttttatccg atgataaatg tgttataata ggtagcgaaa gaggtgaaga aatgaatcaa 1020
acagtagaat atatcaaaga actgacagcc attgcgtcgc caacaggctt tactcgtgag 1080
attgcggact atttagtcaa gactctagaa ggttttggtt accagccggt tcgcacatcc 1140
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aagggcggtg tcaatgtaac tattaaaggt caaaatgatg agcaacatcg ctatgtgact 1200
gcccatgtag atacgcttgg tgctattgtc cgtgctgtca aaccagacgg ccgtctcaaa 1260
atggaccgta tcggtggctt tccttggaac atgattgaag gagaaaactg taccattcat 1320
gtggctagca caggtgaaaa agtatcagga accatcctca tccaccaaac ttcttgccat 1380
gtctataagg atgcaggaac tgcagaacgc acgcaagaca atatggaagt gcgtttggac 1440
gccaaagtaa ctagtgaaaa agaaactcgt gctcttggca ttgaggtcgg tgattttatc 1500
agttttgacc cacgaactgt cgtgacagag acaggtttta tcaagtctcg ccatttggat 1560
gacaaggtca gtgcggcgat tttgctcaat ctccttcgca tttataagga agagaagatt 1620
gaattgcccg taacaactca ttttgctttt tcagtctttg aagaagtggg acacggtgca 1680
aactctaaca ttcctgctca ggtagtagaa tatctggctg tggatatggg agccatggga 1740
gatgaccagc aaacagacga atatacagtg tctatctgtg tcaaggatgc ttctggacct 1800
tatcactatg acttccgtca acatttggtg gctttggcga aagagcaaga tattccattt 1860
aagctggata tctatccatt ttatggttcg gacgcttcag cggctatgtc tgcaggggca 1920
gaagtcaaac acgcccttct cggtgctggt atagagtcta gccattccta tgagcgtacc 1980
catattgact cggtgatcgc aacagaacga atggtcgatg cttatcttaa gagcacgttg 2040
gtggactaa 2049



<210> 45 
<211> 1032 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 45 

aaacacaatg ttgctattcc ttacgatagg gagatagata tggcaatgat agaagtggaa 60 
catcttcaga aaaattttgt gaagactgtt aaggaaccgg gcttgaaggg ggctttgcgc 120 
tcctttattc atcctgaaaa gcagaccttt gaagcggtca aggatttgac ctttgaggtt 180 
ccaaaagggc agattttagg atttatcggg gcaaatggtg ctgggaagtc gacaaccatt 240 
aaaatgctga caggaatttt gaaaccaaca tctggttttt gtcggattaa cggcaagatt 300 
ccccaggaca atcggcaaga ttatgtcaaa gatattggcg tagtctttgg acaacgcacc 360
cagctatggt gggatttggc tctgcaagag acctacactg tcttaaaaga gatttatgat 420 
gtgccagact cgctctttca taagcgtatg gactttttga atgaagtctt ggatttgaag 480 
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aggatcccgt gcggactctt tcactgggac 


aacggatgcg ggcggatatt 






tgctccacaa tcccaaggtt cttthtttag 


atgagccgac cattggtttg 




gacgtttcgg


ttaaggataa tattcgtcgg gcaattactc


agatcaatca agaggaagaa 




actaccattc


ttttgaccac tcacgatttg agtgatattg agcaactttg tgatcggatt 




ttcatgattg 


acaaggggca agagattttt gatggaacgg 


tgagccaact caaggagacc 


780 


tttggtaaga 


tgaagactct ctcttttgaa ctgctaccag gtcaaagtca tctcgtctct 


840 


cactatgacg 


gtctgtctga tatgaccatt gatagacaag 


gaaacagcct caacattgaa 


900 


tttgatagtt 


ctcgctacca gtcagctgac attatcaagc 


aaaccctgtc tgattttgaa 


960 


atccgcgatt 


tgaagatggt ggatacggat attgaggata 


ttatccgtcg cttctaccga 


1020 


aaggagctct 


ag 




1032 



<210> 46 

<211> 1509 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 46 



cattcatata acatcaaaaa gggaggaact gttatggatg caatctttga cctaatcgga 60
aaggttttca atcccatctt agaaatgggt ggacctgtca tcatgttaat cattttgaca 120
gtattggctt tactttttgg agtgaaattc tccaaagcgc ttgaaggtgg tatcaaactt 180
gccatcgctc ttacaggtat cggtgctatc atcggtatgc taaacactgc tttctcagca 240
tcactagcaa aattcgttga aaacactggt atccaattga gtattaccga cgttggttgg 300
gcaccacttg ctacaatcac ttggggttct gcttggacac tatacttctt gctcatcatg 360
ttgattgtca acatagtgat gctagctatg aagaaaacag atacacttga tgtcgatatc 420
tttgatatct ggcacttgtc tatcacaggt ctcttgatta aatggtatgc tgataacaat 480
ggtgtgagtc aaggggtttc actctttatt gctacagcag ctatcgtcct tgtcggtgtg 540
ttgaaaatta tcaactctga cttgatgaaa cctacatttg atgaccttct taacgcccca 600
agttcatcac caatgacatc aactcacatg aactacatga tgaacccagt tatcatggtt 660
ttggataaga tttttgaaaa attcttccca ggccttgata aatatgactt tgatgctgct 720
aaattgaaca agaaaatcgg tttctgggga tctaaattct tcatcggttt catccttggt 780
atcgttatcg gtattatggg aactccacat ccaattgcag gtgttgcaga tgcagataaa 840
atggttgtct


cttggtttga ctgccggtgt atctttggaa 


900 






atggttcatc 


gcagccgtag aaccactatc acaaggtatt 








tcttcaagga 


cgtaaattca atatcggtct tgactggcca 








aatctgggct 


tgtgccaacg tacttgcacc aatcatgttg 






tgctfcctttc 


aaaagttgga aatggtatct tgccacttgc aggtatcatc 








tctcttggtt 


gtaactcgtg gtaaattgct ccgtatgatt 




atcttcggaa cactcttgtt gccactcttc cttctttcag gtacacttat tgcaccattt 1260
gcaacagaac ttgctaaagg tgtaggtgcc ttcccagaag gtgtgagcca aactcaatbg 1320
attactcact ctactcttga aggaccaatc gaaaaacttc ttggttggac aattggtaac 1380
actacaactg gtgatatcaa agcaatcctt ggtgcagtag tcttccttgt attctatatc 1440
ggtatctttg cttggtacag aaaacaaatg atcaaacgta acgaagagta cgcagcaaaa 1500
gcaaaataa 1509



<210> 47 
<211> 366 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 47 

tacaatatgg gtatgatttt aatgaaatta gcatctattt tattattgat actgacctta 60 
gtcgtctgca ttatcctaac caaacttttt agattaaaaa aactaggacg aaactttgcg 120
gatttggctt ttccagtctt ggtatttgag tattacttga ttacagctaa aacctttacc 180 
cataatttcc tccctagact ggggctagcc ctctcgatcc tagccattat tctcgtcttt 240 
ttcttccttt tgaaaaaacg cagcttttac taccctaaat ttatcaaatt cttctggcgt 300 
gcaggattct tattaaccct tatcatgtat atagaaatga ttgttgaatt gttcttaatg 360 
aaatag 366 

<210> 48 
<211> 729 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 48 

aatgatagga ggaactttat gggtcatatt ttcttttttc taagtgtctt tttggcaggg 60 
attctatcct tcttttctcc ttgtatctta cctttgttac cggtctatac aggagtgtta 120 
ctagatgata aggatggtgc tcaggcttct agcggcaaat tttcaatctc agttactagt 180
ttattacgaa ctctggcctt tatagcagga atttccttta tatttatttt gttgggctat 240
ggagctggtt ttttaggcga tttgctttat gcttcttggt tccaatatct tactggggca 300
attattatcc ttcttggttt gcaccaaatg gagattctac actttaaggg gctttataag 360
gaaaagaggc tacaactgca aggacagggg caaaatggta agggctatag tcaggcattt 420
ttattgggct tgacctttag ttttgcttgg acgccttgcg tggggccggt tctggggtct 480
gttttggcct tggcggcttc aggtggttca ggagcttggc agggagctgg tctcatgttg 540
gtgtatacgc tgggcttggc gctaccattc ttgcttctag ctctgacctc tagttatgtt 600
ttgaaacatt tccgaaaact tcatccctat ctcggaatcc tcaaaaaagt gggtggtttt 660
ctcattattg tgatgggctt cttggttctg tttggaaatg cttcaatttt aagtcaatta 720
tttgaataa 729



<210> 49
<211> 303
<212> DNA

<213> Streptococcus pneumoniae
<400> 49

ttttggacga ctagccagtg ccgttacatg ggcatgacca atctctctca aaatagggcg 60
aatcggaacc tgaacatgct tgacatgcat gccaattgca gtgtctccga tatccaatcc 120
agcatgagcc ttgataaatt caacctcaac tggatcctgc ataaacttaa aggctgccaa 180
ctgccccgaa cctcctgcat gaagagtagg atggacactg acaatttcca gaccaaactg 240
ctctgccacc tgacgttcaa caacgagagc ccgattgaca tgctcacaac cttgaactgc 300
taa 303



<210> 50 
<211> 1014 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 50 

ttatgggaga aagaaatgaa taaacgtcta ttttcaaaaa tgagtctggt gacgttgcca 60 
attttagcct tgttttcaca atcagttttg gcggaagaaa acatccattt ttcgagctgt 120 
aaggaagctt gggcgaatgg ctattcggat attcacgagg gagaacctgg ttattctgcc 180 
aagttagacc gtgatcatga tggtgtggct tgcgaattga aaaatgctcc taagggtgct 240
tttaaagcaa aacagtcaac ggctattcaa atcaacacaa gttcagcaac aacaagtggt 300
tgggttaagc aggacggcgc ttggtactac tttgatggaa atggaaatct agtgaaaaat 360




gaagctatta 










420 






ttggtattat 










gcatggcaag gagcttatta ccttaaatca aacggtaaaa tggcacaagg tgagtgggtt 540


tatgattctt 












600 
















tatgatgcca 


cctatcaagc 


ttggtattat 








720 


acatggcaag gaaattacba tctaaaatcg gatggtaaaa tggctgtcaa tgaatgggtt 780
gatggtggac gttattatgt tggcgctgac ggagtttgga aggaagttca agcaagtaca 840
gcttcttcta gtaatgatag caatagtgaa tattctgctg ctttaggaaa ggcaaaaagt 900
tataattcgt tattccacat gtcaaaaaaa cgtatgtata gacaattaac ttctgatttt 960
gataaatttt caaatgatgc agctcaatat gccattgatc atttagatga ttaa 1014



<210> 51 

<211> 1239 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 51 



atgattgaaa cggagaaaaa agaggagcga gtcctgctga ttggtgtgga attgcagggt 60
atggacagtt ttgacctctc catggaagaa ttggctagtt tagcgaaaac ggcaggggca 120
gtcgttgtag atagctacag acaaaaacgt gaaaaatatg attccaagac cttcgtcggc 180
tctggtaagt tggaagagat tgcgcttatg gtggatgcag aagaaatcac tactgtcatc 240
gtcaacaatc gtctgacccc aaggcagaat gtcaatctag aggaagttct cggtgttaag 300
gtcattgacc gtatgcagtt gattttggat atctttgcca tgcgggctcg aagccatgaa 360
gggaagctcc aagtccacct agcccaactc aaataccttt tgcctcgctt ggttggtcag 420
gggattatgc tcagccgtca ggcaggggga attggttccc gtggtcctgg tgaaagccag 480
ctggagctga accgtcgtag cgttcgcaat caaatcacgg atatcgagcg ccagctcaag 540
gtggttgaga aaaatcgtgc gactgtcaga gaaaaacgtt tggagtctag cacttttaag 600
attggtttga ttggttatac taatgctggg aaatcaacta tcatgaacat cttgaccagt 660
aagacccagt atgaagcaga tgagctcttt gcgactctgg atgcgacaac caagagtatt 720
catctgggag gcaatctcca agtaactttg acagataccg ttggctttat ccaagatttg 780
ccgacagagt tggtgtccag tttcaagtca accttggaag aaagcaagca tgtggacctt 840
ctggttcatg ttatcgatgc tagcaatcct taccacgagg agcatgaaaa aacggttctc 900












gatttggtgg aggatttcac gcctacccaa acgccatata ccctcatttc tgccaagtct 1020
gaggacagtc gtgaaaactt gcaagcatta ttgctagata agattaagga aatttttgaa 1080
gcatttaccc tgcgagtgcc tttttcaaag tcctacaaga ttcatgattt agagagtgtt 1140
gcaattctgg aagaacgtga ttatcaggaa gacggcgaag tgattacagg ctacatttcc 1200
gagaaaaata aatggaggtt agaagaattt tatgactga 1239



<210> 52 
<211> 267 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 52 

aaagagagaa agatggtcta tttagtccta ggaattttac tgctcctact ctatgtattt 60
gcgacaccag aaagcattaa agggactgtc aatatcgtcg ctatggtatg tattttagtg 120
gcactcttga ttttattggt tctatctttt ctgaaaattt ttcaattacc aacagaaata 180
ttcctagcaa tagccatgtt gatcctagct tactttagtg ttagagacat cacactcatg 240
ccagtcaaaa aaagtaaaag aagataa 267

<210> 53
<211> 810
<212> DNA

<213> Streptococcus pneumoniae
<400> 53

actataaatg aacaaatttt aatttcggat gagatagata ttgatagtag atattctaga 60
actaaaggtt actattcgtt attttataat gaagagtata ataaaataca gaataaaaca 120
gtattagtat taggagcagg agtcttagga tgttatatat ctctaagtct aagtatgtat 180
ggagtgagga aacttattgt cgctgattac gatataatag aaccatcaaa tttaaatagg 240
caaattcttt atacagagtc ggatgttggt aaggagaaga ttaatgttct ttctgaaaaa 300
atacacaagt ataattcaga tgttcaggta gtacctattt ctattaaagt ttcttcagta 360
gaagaattag aaaaaattgt tgcggaatat gggagtatag attttatcgt taaagcaatt 420
gatacgccca ttgatattat aaaaattgtc aatcaatttg ctgtatcgca taagatatcc 480
tacatatcag gagggtttaa tggatgctat cttattattg ataatatata tatccctacc 540
atcggttctt gctttggttg tcggaatata aacaaagata taaataagta cactttatct 600
gataagacaa agtggccgac tacaccagag atgcctgcta ttttgggagg gataatgact 660
aatttaataa ttaaaatatt tctgggatgt tataatgaaa tcctaataga taacgcttac 720
gtttataata tgagaaatca tgctctaagt caagaaaaat atgttctgga aaacggagaa 780
tgtccaattt gtaaaaaaat aataaagtga 810



<210> 54
<211> 393
<212> DNA

<213> Streptococcus pneumoniae
<400> 54

aaaaataata aagtgaaaga taacaatatt agagcgaaaa catttattcg ttcagtttgt 60
ttttgcttat tatcaggagg agtagctttt ttatctgcta ttgggcagtt cactgttata 120
gaaacacaat taatagtatt gttcttgggt attatttttg ctatatatta tgcttactac 180
aataaaaata ttcaaacatc attggaaaat atagtatggc ttttttcatc gtttgagatt 240
ttatttttgc ttgttaattt tagaacattt attcagttac cagtggatat ttttattggt 300
atgataatat ttttaatgct gtggatattt attatgttag gtatagtgtg tcttagttat 360
tatataactt tattatttag caaggaggct tag 393



<210> 55
<211> 750
<212> DNA

<213> Streptococcus pneumoniae
<400> 55

atttttggat ttttagaacc atctggttct ggaaagacca caacgattaa tattctgact 60
gggcagttcc ttgccgataa aggacaatct attattttgg gacaaaaatc tcaaaattta 120
acaagcggtg aattaaagag aattggattg gttagcgata caagtggatt ttatgagaaa 180
atgtctctgt ataacaatct tcttttttat agtaaatttt ataatattag taaatcacgt 240
gttgataatt tgttaaagcg agtaggatta tatgatagtc gcaagatggt agcaggaaaa 300
ttatccactg gaatgaggca acgaatgctt ttagcacgag ctcttatcaa caaccccgct 360
gtactctttc tggatgaacc gacctcaggt ctagatccca caacttctcg aacaattcat 420
gagttaattt tagaattgaa aacagcaggg acaacgattt ttctaacgac tcatgatatg 480
aatgaagcaa ctcttttatg tgattatgtt gccttattaa ataaagggaa attagttgag 540
caaggagctc cttctgaact cattcaaaga tataataaag ataaaaagat taaggttaca 600
gattataatg ggaatcagat aacttttgat tttacatcac tagaacaggt atctcagact 660
gatctggaaa atattttttc aattcattca tgtgagccta ctttagaaga tatttttatc 720
acattaacag gaggaaagct aaatgcttaa 750



<210> 56
<211> 777
<212> DNA

<213> Streptococcus pneumoniae
<400> 56

ggaggagtaa ggatggaatt tttcatttgt aatcttgtac gagtcgttca atcacctcga 60
ttttatatgt ctttattttt gacccttctt tgcatgagtt taggaaattt ccttgctttc 120
aatggtattt ataaaattga aggtttatcg attttttttg ccgcttcttc tattcgagga 180
ttttcaccga ttagcctagt agctgcactt atctgtacac tgccctattc tagtcagata 240
atagaggatg ctgagagtca ttttctaaca gcacaattgt gtcgaatttc taaaaagaag 300
tatctggcta ttgtgggtag tactgtaatt atttcttctt ttctagtctt ttttctcccc 360
tatttattat tattaggaat taatctttta gtgactcctt atcaggaaat ttatattgga 420
gattatagtg gtgccttaaa agaattattt gattccaatc agtttctcta tagtcttgta 480
acgactctct ggtatggagt ttggggcgct gtgttctcta tttttggact agctagtgct 540
ttgctagtga agaaaaaaat aggagctatt ttcatcccag ttgcctatat gatggttggt 600
ggtatttttt gggctatttt agggctatct tacttagaac ctgtgacaac gctagctttg 660
ggatatcaga aagatatcag tctttcctta gttagtgctc atcttgcttt tattttattt 720
gttagttgtt tggttgttta tggtacattt tttctacatt cagaggacta tgtataa 777



<210> 57
<211> 777
<212> DNA

<213> Streptococcus pneumoniae
<400> 57

ttatctattt tatattttga atcgagggaa tggagatata ggacgagatt tatttcttca 60
gttgctattt ttagtctttc tttttcaaag cattttctat ttcactcgtc aaaagaggag 120
gtttatagaa tgaatgaaat aattacatta aaaaatattg agttgaaatt aaaaaagaca 180
tgtgtttttc aaaaccttaa ctttagttgt aaacaggggg aaattatagg aattactggt 240
gcgaatggct cagggaaaag tgtattgttt aaattaatag ctggtttata tagtccgtct 300
tatggagaag tgttaatcaa tggggaaaat attgttcctg agagaaaaat tccagctaat 360
ttgggagctt tgattgaaga acctggtttt ataaattatt atagtggctt taagaattta 420
caatatttgg caagcatacg aggagtagtt ggtaatcagg aaatcaatga tacactgaaa 480
atagttggtc tatatgagca aaaagaccag aaagttaaaa cttattcgct aggtatgagg 540
aaaaagctag ggattgctca agcaattatg gagaatccct ctattctttt actagatgaa 600
cctatgaatg ccttggataa atcaagtgta gaaaatatga gaacattgtt tagaaagctc 660
tctagtgaaa aagggacaac aattttgatt gctagtcata gtgaagagga tattcgtatc 720
ttatgtgata aagtatatgc aatagaagat aaagtatgta cactgtgttc agattga 777



<210> 58
<211> 759
<212> DNA

<213> Streptococcus pneumoniae
<400> 58

atgtctgaaa ctatcttaga aatcaaggaa ctaaaaaaat ccttcggaga caatcccatc 60
ctccaaggac tttctctaga aatcaaaaaa ggggaagttg ttgtcatcct agggccatct 120
ggttgtggga aaagtaccct ccttcgttgc ctcaacggct tagaaagtat tcaaggtgga 180
gatattcttc tggatggtca gtctatcgtt gaaaataaaa aagattttca cctagttcgc 240
caaaagattg gcatggtctt tcaaagttat gaactctttc cccatctgga tgtcttacaa 300
aacctcatcc taggccctat caaagctcaa ggaagggaca agaaagaagt aacggaagaa 360
gctttgcaat tactagagcg tgtcggtttg ctggataaac aacatagctt tgcccgtcaa 420
ttatctggtg gacagaagca acgtgttgca attgtccgtg ccctcctaat gcatccagaa 480
atcatccttt ttgacgaggt gactgcttcg ctggatccag aaatggtgcg tgaggtgctg 540
gaacttatca atgatttggc ccaagaaggc cgtaccatga ttttagtaac ccacgaaatg 600
cagtttgccc aagccattac tgaccggatt atcttcctcg accaagggaa aatcgctgaa 660
gaaggaacag ctcaagcctt ctttaccaat ccgcaaacca aacgagccca ggaattttta 720
aacgtctttg actttagcca attcggctca tatctataa 759

<210> 59
<211> 672
<212> DNA

<213> Streptococcus pneumoniae
<400> 59

tccattgttg aacaatatct accactatat caaaaggcat tctttctgac cttgcatatt 60
gcagtttggg gaattttggg atcctttctg ctcggtttaa tcgttagtat catccgacat 120
tatcgaatcc ttgttttggc gcaagtagcg acagcctaca ttgaattgtc acgtaatacg 180
ccccttttga ttcaactctt ctttctctac ttcggtcttc cccgaatcgg gattgtccta 240
tcttcagaag tctgtgcaac gcttgggctt gtctttttag gaggctccta tatggcagaa 300
tctttccgaa gtgggctgga agccatcagt caaacccagc aggagattgg cctcgctatt 360
ggtctgacac ctctacaggt cttttactat gtggttcttc cgcaagcaac agcggtggca 420
ctcccctcct ttagtgccaa tgtcattttc cttatcaagg aaacctctgt tttctcagca 480
gtggctttgg ccgacctcat gtacgtcgcc aaggatttga ttggtctcta ctatgagaca 540
gacattgcgc tagctatgtt ggtagttgct tatctaatca tgctgctacc catctcactg 600
gtctttagct ggatagaaag gaggctccgc catgcaggat tcgggaatcc aagtactctt 660
tcaaggaaat aa 672

<210> 60
<211> 1386
<212> DNA

<213> Streptococcus pneumoniae
<400> 60

atgggtctgg aactacgagc gattcagtcc ccaatcttct ctgagccgtt tgattttact 60
tttcatgcgc aagcctttac cttgttagtt gggagtagtg ggtctggaaa atccagcctc 120
tttcaaatga ttgcccaagt tagttctctt ccctatagcg gtcaagtcct gatagatggg 180
agcgaggtca gtcagctttc tatcgtcgaa cgtgtccaga cggttggtat tctcttgcaa 240
aatcctaatc atcaatttac catggagagc ttgtttgagg agttggtttt taccatggaa 300
aatatcggct atcaccttca ggaaattgat tctaaaatag cagaggttgt ccagcaatgt 360
cgttgcaagg acatcttgca ccgtctcatc catcacttat caggtgggga aaagcaaaaa 420
gcagcgctgg ctgtcctctt tgccatgaat cctagggtct atctcttgga tgagcccttc 480
gcttccattg accgcaagag cagaatcgag atattggaga ttctaaaaga gttggtctat 540
gatgggaaga cagttatttt gtgcgaccat gatttatctg actataaagc ctatatcgac 600
catatggtgg agctaagaga cggaaaacta agggaagtgt ttcaaatccc ttcctatgag 660
atgacacagg ttgcttcaaa ggaagttgct tctagcccgg aactattcca tatgaaccgt 720
gtgactggtg agcttggtaa tcgccccctc ttttcaattg ctgatttcac attctatcaa 780
gggatttcct gtatcctggg tgacaatggt gtcgggaaat caaccctctt tcggtctatt 840
cttcaatttc aaaagtataa ggggagcatt acttggaagg gttcggtcct gaaaaagaaa 900
aagagtttgt atcgtgatct gactggtgtt gttcaggaag ctgagaagca gtttatccga 960
gtcagtctgc gagaggagct tcaattagat crgacctgafcfc ctgaaagaaa tcagcggatt 1020
tttcaagctt tacgatattt tgatttggag caggcagtcg ataagagtcc ctatcaatta 1080
agtggtggtc agcaaaaaat tcttcagctc ctgaccatct tgaccagtaa ggcttccgtg 1140
atcttgctag atgaaccttt tgcaggtttg gatgatagag cctgccatta tttttgcaag 1200
tggattgtgg aggagaggaa tcaaggaaga agttttctgc tcattagtca tcgtttagac 1260
cctttgattt ctgtggttga ttattggatt gagatgacta gtcagggtct tcggcatgtg 1320
aaagaagtga ccattaccaa accacttaca tctcagagta gcaataccca aggggaggtg 1380
agatag 1386



<210> 61 
<211> 1212 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 61 

ccatcatctc ttgttttttt gtggtacaat agagctatga aacattttga tactattgtc 60 
atcggtgggg gacctgctgg tatgatggct acgatttcca gtagctttta tggacagaaa 120 
accctcctca tcgaaaaaaa tcggaaactt ggaaaaaaat tagctgggac tggtggggga 180 
cgttgcaatg tgaccaacaa tggtagctta gacaacctgc tagctggaat tcctggaaac 240 
ggacgctttc tttacagtgt tttctcccag ttcgataatc atgacatcat caactttttt 300 
acagaaaatg gtgttaaact taaggtcgaa gaccacggac gcgtctttcc agccagtgac 360
aagtctcgga ctattatcga agctttggaa aagaaaatca ccgaactagg tggtcaagtt 420






ttctgttaaa aaagtagatg accagtttgt 


ccttaagtca 


480 






tgagaaactc attgtcacaa caggtggtaa gtcttatcct 


540 






tggtcacgag attgctcgcc attttaagca 


taccatcacc 


600 






tcctttatta acagattttc cacataaagc 


cttacaaggt 


660 






cctaagttat ggtaagcatg tcatcactca 


tgatttactc 


720 




ttggtttgtc 


aggtcctgct gccctacgca tgtctagctt 


tgtcaaaggt 


780 


ggggaggttc 




tgttttgcct caactttctg agaaggactt ggttacattt 


840 






atccttgaaa aacgctttaa aaaccttgtt 


accagaacgc 


900 




tttttgtaca 


aggatatcct gaaaaagtca aacaactgac 


tgaaaaggaa 


960 


cgagaacaac ttgtccagtc cattaaagaa cttaaaattc ctgtaactgg aaaaatgtcc 1020
cttgcaaagt cctttgttac caagggtgga gtcagtctca aggaaatcaa tcctaaaacc 1080
cttgaaagta agctggtacc tggcctccac tttgcaggcg aagttatgga tatcaatgcc 1140
cacacgggtg gctttaacat cacttctgcc ctctgtaccg gctgggtggc gggaagtctg 1200


cattatgatt 








1212 



<210> 62 
<211> 264 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 62 

ggagacaaaa agatgaagaa aaaatttgcc ctatcgtttg tggcgcttgc aagtgtagca 60 
cttcttgcag cctgtggaga agtgaagtct ggagcagtca acactgctgg taactcagta 120 
gaggaaaaga caattaaaat cgggtttaac tttgaagaat caggttcttt agctgcatac 180 
ggaacagctg aacaaaaagg tgcccaattg gctgttgatg aaatcaatgc cgcagtggta 240 
tcgatggaaa acaaatcgaa gtag 264 

<210> 63 

<211> 783 

<212> DNA 

<213> Streptococcus pneumoniae 
<400> 63






















saggggaact 


ggttggatta 






acggagctgg 






ttttgaccgg




















attgcctctt 


ggg&c 1 1 gg 


acgtactttc 


caaaatatcc 


gfcctctttaa 


agatttaaca 






atgttttgat 


tgcttttgga 


aaccatcaca 


aacagcatgt 


ttttactagt 




ttcttacgct 


taccagcttt 


ttacaagagt


gaaaaagaat 


taaaggctaa 


agctttggaa 






tctttgattt 


agatggtgat 


gcagagactc


ttgctaaaaa 


tctttcctac 




ggacaacaac 


gtcgtttgga 




gcccttgcta 


cggaacctaa 


aattctcttc 




ttagatgaac cagcagcagg tatgaaccca caggaaacag ccgaattgac tgagttaatt 600
cgtcgtatca aagatgagtt taagattaca atcatgttga ttgaacacga tatgaatctg 660
gtcatggaag taacagaacg tatctacgta cttgaatatg gccgtttaat cgctcaagga 720
actccagacg aaattaagac caataaacgc gttatcgaag cttatctagg aggtgaagcc 780
taa 783


<210> 64 
<211> 705 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 64 
aaaggaactc 


acatgtcaat 


tattgaaatg 


agagatgtcg 


fc t aaaaaa ta 


cgacaacgga 




acaactgctc 


tacgcggtgt 


ttcggttagc 










ggaccttcag 


gagcagggaa gtcaactttt 




tgfcatcgtga 






gataaaggaa 


gcctatcagt 


tgctggtttt 










ccgcttctac 


gtcgtagtgt 


tggggttgtc 


ttccaggatt 


ataaattgtt 






actgtctatg aaaatattgc ttacgctatg gaagtaatcg gggaaaatcg ccgtaatatc 360
aaaagacgag tgatggaagt tttggacttg gttggattga agcataaggt tcgttctttc 420
ccaaatgaac tctcaggtgg ggagcaacag cggattgcga ttgcgcgtgc aattgtaaat 480
aatcccaaag tattgatagc tgatgagcca acaggaaatc tggatccgga taattcatgg 540
gaaattatga atctcttgga acggattaac ctacaaggaa caactatttt gatggcgact 600
cataatagcc agattgtaaa taccttgcgc caccgtgtca ttgccattga aaatggccgt 660
gtcgttcgtg acgaatcaaa aggagagtat ggatacgatg attag 705 

<210> 65 

<211> 2181 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 65 









tctttcgaat 


tttggcaaaa 


attcggtaag 




gctttgatgg 


tagttatcgc 


ggttatgccg 


gctgctggtt 


tgatgatttc 


aatcggtaag 




tctatcgtga 


tgattaaccc 


aacctttgca 


ccacttgtca 


tcacaggtgg 


aattcttgag 




caaatcggtt 


ggggggttat 


cggtaacctt 


cacattttgt 


ttgccctagc 


cattggagga 




agctgggcta 


aagaacgtgc 


tggtggtgct 


ttcgccgctg 


gtcttgcctt 


catcttgatt 
























aaagttgctg 




cagtgttctt 




gaagctccag 


ccttgaacat 


ggggcjtattc 


gtagggatta 


tctcaggttt 


tgtaggggca 




actgcttaca 


acaaatacta 


caacttccgt 


aaacttcctg 


atgcactttc 


attcttcaac 




gggaaacgtt 


tcgtaccatt 


tgtagttatt 


cttcgttcag 


caatcgctgc 


aattctactt 




gctgctttct 


ggccagtagt 


tcaaacaggt 


atcaataact 


tcggtatctg 


gattgccaac 




tcacaagaaa ctgctccaat tcttgcacca ttcttgtatg gtactttgga acgtttgctc 720
ttgccatttg gtcttcacca catgttgact atcccaatga actacacagc tcttggtggt 780
acttatgaca ttttaactgg tgcagctaaa ggtactcaag tattcggtca agacccacta 840
tggcttgcat gggtaacaga ccttgtaaac cttaaaggta ctgatgctag tcaatatcaa 900
cacttgttag atacagtaca tccagctcgt ttcaaagttg gacaaatgat cggttcattc 960
ggtatcttga tgggtgtgat tgttgctatc taccgtaatg ttgatgctga caagaaacat 1020
aaatacaaag gtatgatgat tgcaacagct cttgcaacat tcttgacagg ggttactgaa 1080
ccaatcgaat acatgttcat gttcatcgca acacctatgt atcttgttta ctcacttgtt 1140
caaggtgctg ccttcgctat ggctgacgtc gtaaacctac gtatgcactc attcggttca 1200
atcgagttct tgactcgtac acctattgca atcagtgctg gtattggtat ggatatcgtt 1260
aacttcgttt gggtaactgt tctctttgct gtaatcatgt actttatcgc aaacttcatg 1320
attcaaaaat tcaactacgc aactccaggg cgcaacggaa actacgaaac tgctgaaggt 1380
tcagaagaaa ccagcagcga agtgaaagtt gcagcaggct ctcaagctgt aaacattatc 1440
aaccttcttg gtggacgtgt aaacatcgtt gatgttgatg catgtatgac tcgtcttcgt 1500
gtaactgtta aagatgcaga taaagtagga aatgcagagc aatggaaagc agaaggagct 1560
atgggtcttg tcatgaaagg acaaggggtt caagctatct acggtccaaa agctgacatt 1620
ttgaaatctg atatccaaga tatccttgat tcaggtgaaa tcattcctga aactcttcca 1680
agccaaatga ctgaagcaca acaaaacact gttcacttca aagatcttac tgaggaagtt 1740
tactcagtag cagacggtca agttgttgct ttggaacaag taaaggatcc agtatttgct 1800
caaaaaatga tgggtgatgg atttgcagta gaacctgcaa atggaaacat tgtatctcca 1860
gtttcaggta ctgtgtcaag catcttccca acaaaacatg cttttggtat tgtgacggaa 1920
gcaggtcttg aagtattggt tcacattggt ttggacacag taagtcttga aggtaaacca 1980
tttacagttc atgttgctga aggacaaaaa gttgcagcag gagatctcct tgtcacagct 2040
gacttggatg ctatccgtgc agcaggacgt gaaacttcaa cagtagttgt cttcacaaat 2100
ggtgatgcaa ttaaatcagt taagttagaa aaaacaggtt ctcttgcagc taaaacagca 2160
gttgctaaag tagaattgta a 2181



<210> 66 
<211> 1551 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 66 

ggaattaaaa tgagtatttt agaagttaaa aatctgagtc acggttttgg tgaccgtgca 60 
atttttgaag atgtgtcctt ccgtctcctc aagggagaac atatcggcct ggtcggtgcc 120 
aatggtgaag gaaaatcaac ctttatgagt atcgtgactg gtaaaatgct gccagatgaa 180 
ggaaaggttg agtggtccaa atatgtgacg gctggttact tggatcagca ctctgtcctt 240 
gctgaaagac agtcggtgcg tgatgttctc cgtacggctt ttgatgagct tttcaaagct 300 
gaagctcgta tcaatgacct ctatatgaaa atggctgaag acggcgcgga tgttgatgct 360 
ctcatggaag aagtaggaga acttcaagac cgtctggaga gtcgtgattt ctataccttg 420 
gatgctaaga ttgacgaagt agcgcgtgct cttggtgtta tggactttgg catggatacg 480 
gatgtaactt ctttgtcagg tgggcaaaga accaaggtgc ttttggcaaa acttctcctt 540 
gaaaagcctg atatcttgct gttggacgag ccgaccaact acttggatgc tgagcatatt 600
gattggctca agcgctatct ccaaaactat gagaatgcct ttgttctcat ttcgcacgat 660
attccattcc tcaatgacgt tattaatatt gtctatcatg tggaaaatca acagctgacg 720
cgttactctg gtgactacta ccagttccaa gaagtttatg ctatgaagaa atctcagcta 780
gaggcagcct acgaacgcca gcagaaagag attgcagacc tcaaggactt tgtggctcgt 840
aataaagccc gtgttgcaac tcgtaatatg gctatgtctc gtcaaaagaa attggataag 900
atggatatta tcgaactcca aagtgagaaa ccaaaaccat cctttgattt caaaccagct 960
cgtacaccag ggcgctttat cttccaagcc aagaacttgc aaattggtta cgaccgtcct 1020
cttactaagc ctttaaatct taccttcgaa cgcaatcaaa aggttgcgat tattggtgct 1080
aatggtattg gaaaaacaac tctcttgaag agtctcttgg gcattatctc gccaatcgct 1140
ggggaagtgg agcgtggaga ttatttagaa cttggttatt ttgagcagga agtagaaggc 1200
ggtaatcgcc aaactcctct tgaagctgtc tggaatgcct ttcctgccct taatcaagca 1260
gaagtccgtg cagcccttgc ccgttgtggt ttgacaacca aacatattga aagccagatt 1320
caagtattat cagggggaga gcaagccaag gttcgtttct gtctcttgat gaatcgtgaa 1380
aacaacgttt tagtgctgga cgagccgacc aaccatttgg atgtggatgc aaaggatgag 1440
ctcaaacgcg ctctcaaaga atatagggga tctatcctta tggtctgcca cgagccagac 1500
ttttatgaag gctggataga ccaaatatgg gattttaata atttaactta a 1551



<210> 67
<211> 822
<212> DNA

<213> Streptococcus pneumoniae
<400> 67

atttttgaga ggatcagaat gaaaaaacta gcaacccttc ttttactgtc tactgtagcc 60
ctagctgggt gtagcagcgt ccaacgcagt ctgcgtggtg atgattatgt tgattccagt 120
cttgctgctg aagaaagttc caaagtagct gcccaatctg ccaaggagtt aaacgatgct 180
ttaacaaacg aaaacgccaa tttcccacaa ctatctaagg aagttgctga agatgaagcc 240
gaagtgattt tccacacaag ccaaggtgat attcgcatta aactcttccc taaactcgct 300
cctctagcgg ttgaaaattt cctcactcac gccaaagaag gctactataa cggtattacc 360
ttccaccgtg tcatcgatgg ctttatggtc caaactggag atccaaaagg ggacggtaca 420
ggtggtcagt ccatctggca tgacaaggat aagactaaag acaaaggaac tggtttcaag 480
aacgagatta ctccttattt gtataacatc cgtggtgctc ttgctatggc taatactggt 540
caaccaaaca ccaatggcag ccagttcttc atcaaccaaa actctacaga tacctcttct 600
aaactcccta caagcaagta tccacagaaa attattgaag cctacaaaga aggtggaaac 660
cctagtctag atggcaaaca cccagtcttt ggtcaagtga ttgacggtat ggatgttgtg 720
gataagattg ctaaggccga aaaagatgaa aaagacaagc caactactgc tatcacaatc 780
gacagcatcg aagtggtgaa agactacgat tttaaatctt aa 822

<210> 68 

<211> 1368 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 68 





aggagttttc 






ttgctatcgg 






ggaggaatcg ctaccatgaa ccgtgctggt gaacatggag ccaaagcagc cgttattgag 120
gaaaagaaat taggtggaac ctgtgtcaac gtcggttgtg ttcctaaaaa aatcatgtgg 180
tacggggcgc aaatcgctga gactttccat caatttggag aagactacgg ctttaagact 240
actgatctta actttgactt tgcaacccta cgtcgcaatc gtgaagccta cattgatcgc 300
gctcgttctt cttatgatgg tagttttaaa cgcaacggtg tagacttgat tgaaggtcat 360
gctgaatttg tagattctca tactgtaagc gtaaatggtg aactgattcg tgctaaacat 420
atcgtgattg ctacaggtgc ccatccaagt attcctaata ttcctggtgc tgagctaggt 480
ggctcttctg atgatgtatt tgcctgggaa gaacttccag agtcaattgc cattctaggc 540
gctggttata tcgccgttga attagctggc gtactccaca cttttggtgt caagacagat 600
ctctttgttc gccgcgatcg tcctttacgt ggttttgatt cctacatcgt tgaaggtttg 660
gtcaaggaaa tggaaagaac aaacttacca cttcacactc acaaagtccc tgtcaagtta 720
gaaaaaacta ctgacggcat taccattcat ttcgaagatg gtactagtca cacagctagc 780
caagttatct gggctacagg tcgccgtcca aacgttaagg gcttgcaact tgaaaaagct 840
ggagtgactc tgaacgaacg tggctttatc caagtggatg aataccaaaa tactgttgtt 900
gagggaatct atgctctagg tgatgtaacg ggcgagaaag aactgactcc agttgcaatc 960
aaggccggac gtaccctatc tgaacgtctc tttaacggaa aaactactgc aaaaatggat 1020
tactcaacta ttccaactgt tgtcttttca caccctgcta tcggaactgt tggtttgaca 1080
gaagagcaag ctattaaaga atacggtcaa gaccaaatca aggtttataa atcaagcttt 1140
gcatctatgt actctgcttg cacttgcaac cgtcaagaat cccgtttcaa actcataaca 1200
gctggttcag aagaaaaagt tgtcggactt catggaattg gctacggcgt tgatgaaatg 1260
attcagggat ttgccgttgc tatcaaaatg ggagcaacca aggctgactt tgatgcaact 1320
gtggcaattc acccaactgc atctgaagaa tttgtaacca tgcgttaa 1368



<210> 69 

<211> 1338 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 69 



aagatgttca 


gtaaacttaa 


aaaaacatgg 


tatgcggatg 


actttagtta ttttatccgc 




aacttcggtg 


tcttcaccct 


gattttttct 


acaatgactc 


tgattatttt acaagtcatg 




cattcgagtc 


tttatacttc 


ggtggacgat 


aagcttcatg 


gactgagtga aaatcctcaa 




gcagttattc 


agctggctat 


aaatagggca 


acagaagaga 


ttaaagattt agaaaatgct 


240 


agggcggacg 


ctagtaaagt 


agaaataaaa 


cctaatgtca 


gttccaatac ggaagtcatt 


300 


ctctttgata 


aagactttac 


tcaacttctt 


tctggaaatc 


gatttttggg cttggataag 


360 


attaagttag 


aaaagaaaga 


actaggacat 


atctaccaga 


ttcaggtttt taatagctat 


420 


gggcaggaag 


aaatctatcg 


tgtgattttg 


atggagacca 


atattagttc ggtttcaacc 


480 


aatatcaagt 


atgctgctgt 


cttgattaat 


accagtcagt 


tggaacaggc tagtcaaaag 


540 


catgagcaat 


tgattgtggt 


cgtgatggct 


agtttctgga 


ttttgtcttt acttgccagt 


600 


ctctatctag 


ctagggtcag 


tgttaggccc 


ctgcttgaga 


gtatgcagaa gcaacagtct 


660 


tttgtggaaa 


atgccagtca 


tgagttacga 


actccactcg 


cagttttgca aaatcgctta 


720 


gagacccttt 


ttcgtaagcc 


agaagctacc 


attatggatg 


tgagcgaaag cattgcatcg 


780 


agtttggaag 


aagtccgaaa 


tatgcgtttt 


ttaacgacaa 


gcttgctgaa cttagctcgg 


840 


agagatgatg 


ggattaagcc 


ggagcttgca 


gaagttccaa 


ctagcttttt taatacaact 


900 


ttcacaaact 


acgagatgat 


tgcttcggaa 


aataatcgtg 


tcttccgttt tgaaaatcgt 


960 


atccatcgaa 


caattgtcac 


agatcagctt 


cttctgaaac 


aactgatgac cattcttttc 


1020 


gataatgccg 


tcaagtatac 


tgaggaggat 


ggtgaaattg 


attttcttat ctcggcgacc 


1080 
gatcgcaatc tttatttact tgtttctgat aatggaatcg gtatttcgac agaagataaa 1140 

aagaaaattt ttgaccgttt ttatcgagta gacaaggcta gaacccggca aaaaggtggt 1200 

tttggtttag gattatccct agccaagcaa attgtagatg ctctaaaagg aactgttact 1260 

gtcaaagata ataaacccaa gggaacaatc tttgaagtga agattgccat tcagacacca 1320 

tctaaaaaga aaaaataa 1338 

<210> 70 

<211> 1092 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 70 



gattgtaatt 


ttcttacggg 


catgattctc tccttaacag 


cacauaccua uuuuaucatu 




ttttcggcag 


agaattatta 




taaagtccct tttcattttc 




aaagcatggc 


tgafctttgga 


gaaatgtggt ataatttttc 


ttatggaaaa gattgtcatt 




acagcaactg 


ctgaaagtat 


tgaacaagtt gaacaactac 


tcgaagctgg cgtagaccgt 




atctatgtcg 


gtgagaaaga 


ttttggtctt cgtctgccaa 


cgacctttag ttatgaccaa 




ttacgtgaaa 


tcgctaagtt 


ggttcatgat gctggtaagg 


aattgatcgt tgcggtcaat 


360 


gctctcatgc 


accaagatat 


gatggaccgt atcaagcctt 


tcttaaactt cttggaagaa 


420 


atcaagacag 


actatattac 


gattggggat gcaggcgtct 


tttacgtagt taaccgcgat 


480 


ggttattcat 


ttaagaccat 


ctacgatgct tcaaccatgg 


taactagcag tcgtcagatt 


540 


aacttctggg 


gacaaaaggc 


tggcgcatct gaggctgttt 


tggcgcgtga aattccatca 


600 


gctgaacttt 


tcaaaatgcc 


agagattttg gaaattcctg 


ctgaagtttt ggtttacggt 


660 


gctagcgtca 


tccatcattc 


taaacgtcca ctcttgcaaa 


actactataa ctttacacat 


720 


atcgatgatg 


aaaagacgca 


taaacgtgac ctcttcttgg 


ctgagccaag tgatccagag 


780 


agccactatt 


ccatttttga 


agataatcat gggacccata 


tctttgccaa caatgacctt 


840 


gatttgatga 


tcaaattaac 


agaattggtg gagcatggct 


ttactcgctg gaaactagaa 


900 


gggctctaca 


ctcctggtca 


gaactttgtt gagattgcaa aactctttat ccaagcgcgt 


960 


agcttgattc 


aagagggcaa 


ctttagtcat gctcaagcct 


tcttgctgga tgaagaagtt 


1020 


cgtaaacttc 


accctaaaaa 


ccgtttcctt gatacaggat tttatgacta cgatcctgac 


1080 


atggttagat 


aa 






1092 
<210> 71 

<211> 765 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 71 

















tatgaatttt tcttttttac ctaagtattt 


accttatttt 








gattcttatt tctatctgtg ttatcttttt 








tcttggcttt 








aacttgtacg 


tttggatttt 


ccgtgggaca ccgatgatgg ttcaaattat 


gattgccttt 


300 


gctcttatgc 


atatcaatgc 


tccgactatt cagattggaa ttttaggtgt 


tgatttttcg 


360 


cgtctgattc 


cagggatttt 


gattatctct atgaatagtg gtgcttatgt 


ttcggagact 


420 


gttcgtgccg 


gaatcaatgc 


ggttccaaaa ggtcagctag aagcggctta 


ttcgctaggg 


480 


attcgtccta 


aaaatgcgat 


gcgttatgtg attttgccac aagcagtcaa 


aaatatcttg 


540 


ccagcattgg 


ggaacgaatt 


tatcaccatt atcaaggaca gctccctctt 


atcagctatt 


600 


ggggtcatgg 


agttgtggaa 


tggggctaca acagtttcta caacaaccta 


tctaccttta 


660 


acaccacttt 


tatttgcagc 


attttactac ttgattatga cctctattct 


gacagtagcc 


720 


ttgaaagctt 


ttgaaaaaca 


tatgggacaa ggagataaga aataa 




765 



<210> 72 
<211> 741 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 72 

gaaataatga cagaaacctt gataaaaatt gaaaatttac ataaatcctt tggaaagaat 60 
gaagtattga agggcatcaa cctcgagatt aaaagaggag aagttgtcgt tatcatcggt 120 
ccttcaggga gcgggaaatc taccttgctt cgctctatga atttgttgga agaagcaacc 180 
aaggggaagg ttatctttga gggagtcgat attacggaca agaagaatga cctgtttgcc 240 
atgcgtgaga agatgggcat ggtttttcaa caattcaatc tctttcctaa tatgactgtg 300 
atggaaaata tcaccttgtc ccctatcaag accaaaggtg acagtaaggc cgttgcagag 360 
aaaagagctc aggaactttt ggaaaaagtt ggtttgccag ataaggcaga cgcttatcca 420 
cagagtttgt caggtggcca gcaacagcgg attgccatcg cgcgtgggtt ggctatggaa 480
ccagatgttt tgctctttga cgagccaact tcagccctag atcctgagat ggttggagaa 540
gttctggctg ttatgcaaga tctagccaag tcaggaatga ccatggttat cgtaacacat 600
gagatgggat ttgcccgtga ggtggcagat cgtgtcatct ttatggcaga cggtgtggtt 660
gttgaagacg gaacacctga gcagattttt gaacaaaccc aaggacaaag gactaaagac 720
ttcttgagta aggttttata a 741



<210> 73 

<211> 261 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 73 



ttcacaactt ataggaggtg tactatgaaa atcttaaaac gttacatatt ggaactctgt 60
tttattttaa gttttgcttt accttttata aaaggaacca atgcagataa tggtagatgc 120
tttgtggaaa cctattacgg ttttactttt ttgatggaac atgctattgt aacagctgtc 180
tttatttgtt cgttcttaat tgctttctta ctaaaaaacg atggacgaaa tggattgctg 240
cgggtagtta ttgcttttta g 261



<210> 74 

<211> 1548 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 74 



aaggaagagc 


acatggcaca cgaaaatgtc 


attgagatgc 


gtgatattac 


caaggtgttt 


60 


ggtggatttg 


ttgccaacga 


caaaatcaac 


ttgcacctac 


gaaaaggtga 


aattcatgca 


120 


cttttaggag 


aaaatggggc 


tggtaagtcc 


acgctaatga 


acatgttagc 


aggccttctt 


180 


gaaccaacta 


gtggtgaaat 


cgcggtcaac 


ggtcaagttg 


tcaatctcga 


ctccccatct 


240 


aaagcagcta 


gcttgggaat 


cgggatggtt 


caccagcact 


ttatgttggt 


tgaagccttc 


300 


acagtggctg 


aaaacatcat 


tttaggtagt 


gaattgacta 


aaaatggtgt 


gctagatatc 


360 


gctggagcta 


gcaaagaaat 


caaggctctt 


tctgaacgtt 


atggcttagc 


tgttgaccct 


420 


tctgccaagg 


tagcagatat 


ctcagttgga 


gcccaacaac 


gtgtagaaat 


tttaaaaaca 


480 


ctttatcggg 


gggctgatat 


ccttatcttt 


gacgaaccaa 


cggctgtttt 


gactccatca 


540 


gaaattgatg 


agttgatggc 


tattatgaaa 


aatcttgtca 


aagaaggaaa 


atcaattatc 


600 
ttgattaccc 


acaaattgga 


tgaaattcga 


gcagtttctg 


accgtgttac agttatccgt 


660 


cgtgggaaat 


caattgaaac 


cgttgaaatt 


gcaggggcta 


ccaatgctga tttggcggaa 


720 


atgatggtag 


gacgttctgt 


ttcctttaaa 


acagagaagc 


aagcctctaa accaaaagaa 


780 


gtggttttgt 


ctatcaaaga 


tttggtggtc 


aatgaaaacc 


gtggtgttcc agctgttaaa 


840 


aatctgtcct 


tggatgttcg 


tgctggagag 


attgttggta 


ttgcggggat tgatggaaat 


900 


ggtcagtctg 


aactgattca 


agccattaca 


ggtcttcgta 


aggttgaatc tggtagcatt 


960 


gagctaaaag 


gagattcaat 


tgtaggcttg 


cacccacgtc 


agattacaga actaagtgtt 


1020 


gggcacgttc 


cagaagaccg 


tcaccgtgat 


ggcttgattt 


tggaaatgat gatatctgaa 


1080 


aatattgccc 


ttcaaaccta 


ctataaagaa 


ccacatagta 


aaaatggaat tttgaattat 


1140 


tcaaatatta 


cttcttatgc 


taaaaagctg 


atggaagagt 


ttgatgttcg cgctgccagt 


1200 


gaattagttc 


ctgcagctgc 


actctcagga 


ggaaatcaac 


aaaaagcaat tattgctcgt 


1260 


gaaattgatc 


gagatcctga 


tctccttatc 


gttagccagc 


caactcgtgg gttggatgtc 


1320 


ggtgccattg 


agtatatcca 


caaacgcttg 


attgaagagc 


gtgataatgg caaggctgtc 


1380 


cttgttgtca 


gctttgaatt 


ggatgagatt 


ttaaacgtct 


cagaccgtat tgccgttatc 


1440 


cacgatggta 


agattcaagg 


tattgtatca 


ccagaaacaa 


ccaataaaca agaacttggt 


1500 


gtcttgatgg 


ctggtggaaa 


cttgggaaag 


gagaagagtg 


atgtctaa 


1548 



<210> 75 

<211> 939 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 75 



gggaggagaa caaaaatgac 


agagttggca 


aagcaactat 


tagagttgac 


ctatattgtg 


60 


attggttgtc aatttctcca 


tacagcctat 


tgtagttata 


aagataaaac 


aaacccagtt 


120 


cgacttggga catctgcatt 


ttggactcta 


ttgtctatta 


cgtttatagg 


tggttcctat 


180 


atgccaaata tgagtattgg 


tattattgta 


atcctattat 


cgctgttaac 


attgtttaag 


240 


caagtccgta tcggaacctt 


gccatcctta 


gatgaaatga 


aagccaatat 


tgaatctaac 


300 


aggttgaaaa ataaaatttt 


tattccagtt 


atgctgatgg 


caatacttgc 


gttggtctta 


360 


gcgcaaatga ttccagaatt 


tagcaagatt 


tcgattagcc 


ttgccgcctt 


gtttgctaca 


420 


atttctgttc ttgtgattac 


caatagtcac 


cctaagagtc 


tgttatcaga aaataatcga 


480 
atgactcagc aagtttcaac 


aagtgggatt 


gttcctcaat 


tattaggggc 


tttgggggct 


540 


atttttactg tagcaggtgt 


tggtgatgtt 


atctctcatc 


tgattagcgg 


tattgttcct 


600 


tcagatagtc gctttatagg 


agttttggcc 


tatgttcttg 


gaatggttct 


attcacaatg 


660 


attatgggaa atgcttttgc 


agcattcacc 


gttattacag 


caggtgttgg 


agttcccttt 


720 


gtatttgctc tgggagctaa 


tccaattgtg 


gctggtgctc 


ttgccatgac 


agcaggttat 


780 


tgtgggacct tattgacccc 


aatggctgct 


aattttaacg 


ctctaccagc 


agcattgatg 


840 


gatatgaaag atcagaatgg 


cgttataaag 


gctcaagcag 


gtgttgctct 


agtaatgatt 


900 


gttattcaca tattcttaat 


gtactttctc 


gcattttag 






939 



<210> 76 

<211> 1113 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 76 



ctcatgtttc 


gtagaaataa 


attatttttt 


tggaccacag 


aaattttact 


cttaaccatc 


60 


atcttttacc 


tatggagaca 


gatggggtct 


ttgattaacc 


cttttgttag 


cgtgcttaat 


120 


acaattatga 


ttccattttt 


attagggggc 


tttttttatt 


atttgacaaa 


ccctattgtt 


180 


actttcttaa 


ataaagtctg 


taaactcaat 


cgtttgcttg 


gtattttaat 


taccttgtgt 


240 


actttggtct 


ggggaatggt 


cataggtgtt 


gtctatctct 


tacctatttt 


gattaatcag 


300 


ttatctagtt 


tgattatatc 


tagtcaaact 


atttatagtc 


gagtacaaga 


cttaatcata 


360 


gacttatcta 


attatcctgc 


gctccagaat 


ttggatgtag 


aagctacaat 


tcagcagtta 


420 


aacttatcct 


atgttgatat 


tcttcaaaat 


atcctaaata 


gcgtatcaaa 


tagtgtgggg 


480 


agcgtcttgt 


cagctcttat 


cagtactgtt 


ttgattttga 


ttatgactcc 


agtttttttg 


540 


gtttatttct 


tattagatgg 


acataaattc 


ttgcccatgc 


ttgaaagaac 


gattctaaag 


600 


agggatcgct 


tgcatattgc 


aggcttatta 


aagaatttaa 


atgcgacgat 


tgctcgctat 


660 


attagtggag 


tttcgattga 


cgcaatcatt 


ataggttgtt 


tggcttatat 


tggctatagt 


720 


attattggtt 


taaaatatgc 


tttagttttt 


gccatttttt 


ctggtgtagc 


caatttaatt 


780 


ccttatgtgg 


ggccaagtat 


tggtttgatt 


cctatgatca 


tcgcaaatat 


attcactgat 


840 


ccccatagac 


tgctgattgc 


agtgatttat 


atgcttgttg 


ttcagcaggt 


agatggcaat 


900 


atcttatatc 


ctcgaatcgt 


aggaagtgtt 


atgaaggttc 


atccaatcac 


gattttagtt 


960 






ttacttttgt tgtcaagcaa tatctatggt gtagttggaa tgattgtcgc agtgccaacc 1020 
tattctatct tgaaagaaat ttctaagttc ttatcccatt tgtatgaaaa tcataaaata 1080 
atgaaagaac gagaaagaga attagctaag taa 1113 

<210> 77 

<211> 1995 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 77 

















ggctgttctc 






fcccctaacct 


tcaatccaaa 


gattgcggaa 


atccgtggag gaaccaccat tcaagcaaca 




cttggatttg 


gtatgtttgt 


cgttaccctt 


gcgtcagcca ttatcgtcct ctatgccaat 










ctgggtatat atggcatgtt aggcttggag 




aagcgccatc 


taatcagtat 


gacctttaag 


gagttagtgg tatttgggat tctaactgtt 




ggagcgggta 












tgaaggttga 


gctggttgct 








ttggafctgat 


ttfccctaggc 


ctcatgttcc tgaatgctct tcgaatcgcc 




cgtatgaatg 


ccctccagct 


ctcgcgtgag 


aaagcaagcg gagagaaaag aggtcgcttc 


600 


ctacctctcc 


aaacgattct 


tggttccata 


agtttaggga ttggctatta tcttgccctt 


660 


acggtaaccg 


atcctcttac 


agccctaaca 


actttcttcc tagctgtttt gctggttatc 


720 


tttggtactt 


atctattgtt 


taatgcaggg 


attacagtct tcctacaaat cttaaagaaa 


780 


aacaagaaat 


actattacca 


acctaataac 


ctcatatctg tttccaactt gattttccgt 


840 


atgaagaaaa 


atgcggttgg 


actagcaacc 


atcgctattt tgtcaacaat ggttttggta 


900 


accatgtcag 


cagcgacaag 


cattttcaat 


tccgcagaaa gctttaaaaa agttctaaat 


960 


cctcatgatt 


ttggggtttc 


agggcaaaat 


gttgaaaaag aagatttgga caaactcttg 


1020 


agccagtttg 


caagtgacaa 


aggttatagt 


gtcaaagaga aagaagtact tcgttacagt 


1080 


aactttggta 


ttgcaaatca 


agaaggaacc 


aagttaacta tttttgaaaa aggacaaaac 


1140 


cgtgtccaac 


ccacaacagt 


tttcatggta 


tttgaccaaa aagattatga aaatatgact 


1200 


ggtcaaaaac 


tgtctctatc 


aggaaatgag 


gtcggtctct ttgccaaaaa tgacggactg 


1260 
aaaggacaga 


aagctctaac 


tctaaatgat 


catcaatttt ctgtcaaaga 


agaatttaat 


1320 


aaagatttca 


ttgtgaacca 


tgttccaaat 


aagtttaata tcttgactac 


tgattacaat 


1380 


taccttgttg 


ttcctgattt 


acaagccttt 


ttggatcaat tcccagattc 


ggctatctat 


1440 


aatcagtttt 


acggtggtat 


gaatgtaaat 


gtcagtgaag aagaacaact 


caaggtcgct 


1500 


gaggagtatg 


aaaactacct 


caatcaattt 


aatgctcaat tagacacaga 


aggtagctat 


1560 


gtttatggta 


gcaatctagc 


agatgctagt 


tctcagatga gtgccctctt 


tggtggtgtc 


1620 


ttctttatcg 


gtattttcct 


atccattatc 


tttatggtcg gaactgttct 


ggtcatctac 


1680 


tacaaacaaa 


tttctgaagg 


ctacgaagac 


cgtgaacgct ttattatctt 


gcagaaagtc 


1740 


ggtttggacc 


aaaagcaaat 


caagcaaacc 


atcaacaaac aggttttaac 


tgttttcttc 


1800 


cttcctttgc 


tctttgcctt 


catacatctc 


gcctttgcct accatatgct 


tagcctgatt 


1860 


ttaaaagtga 


ttggtgtact 


ggatacgact 


atgatgttga ttgtgacctt 


gtctatctgc 


1920 


gctatcttcc 


tcatcgccta 


tgtgctgatt 


ttcatgatta cttcaagaag 


ttatcgcaag 


1980 


attgtgcaaa 


tgtaa 








1995 



<210> 78 

<211> 1290 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 78 



ggacatttta 


gaagaagagg 


aaggaaaaaa 


atgagtcgtt tactagttat 


tggttgtggg 


60 


ggcgttgccc 


aagttgctat 


ttcaaagatt 


tgtcaagata gcgaaacatt 


tacagagatt 


120 


atgattgcta 


gccgtaccaa 


gtcaaaatgc 


gatgacttga aagcgaagct 


agaaggcaaa 


180 


acaagtacta 


aaattgaaac 


tgcagcactt 


gatgctgaca aggttgaaga agtgattgcc 


240 


ctgattgaaa 


gctacaaacc 


agaagctgtt 


ttgaatgtag ctctgcctta 


tcaagattta 


300 


accattatgg 


atgcttgttt 


ggcaacaggt 


gttcactata tcgatacagc 


caactacgaa 


360 


gcagaagaca 


cagaagaccc 


tgagtggcgt 


gctatctacg aaaaacgttg 


taaggaactt 


420 


ggttttacag 


cctactttga 


ctactcatgg 


cagtgggctt atcaagagaa attcaaagaa 


480 


gcaggcttga 


ctgctcttct 


tggttctggt 


tttgacccag gtgtaactag 


tgtcttttca 


540 


gcttatgccc 


tcaaacacta 


ttttgatgaa atccattata tcgacatttt 


agactgtaat 


600 


ggcggtgacc 


acggttatcc 


atttgcaacc 


aactttaatc cagaaattaa 


tctccgtgag 


660 
gtttctgcgc 


caggttctta 


ctgggaagat 


atcaagcgtg 


agtatgattt 


ccctcaagtt 


gaagaaatcg 


aatcattggc 


caagaacatt 


acttttggtc 


aatcttactt 


gacgcacatg 


acggatacca 


ttaactttaa 


cggccaagaa 


cttccagatc 


ctgccagtct 


tgggccacgt 


tttacaggtg 


tcaaagacgg 


tgtcaaaaag 


caggaatgtt 


acgcagaggt 


tggttcgcaa 


atgattggga 


caaaattagt 


catgaacgga 


gaggagttag 


atccagatcc 


attcatggaa 


gtggttgaaa 


atccacaaat 


ggtggactaa 



<210> 79 
<211> 669 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 79 

tctaagagag gagaaaatat ggaagcaatt 
gtcatctgta ctggtctggg cttgcttgta 
caaacacctg tcaaagagac gaatttgcag 
tcgaccgaaa aggaagtgaa gaaggaagaa 
acagtagatg tcaaaggtgc tgtcaaatcg 
cgagtcaatg atgctgttca gaaggctggt 
ctcaatctag ctcagaaagt tagtgatgag 
gaagcagtta gtcaacagac tggttcgggg 
gtcaatctca acaaggccag tctggaagaa 
cgagctcagg acattattga ccatcgtgag 
ctcaagaagg tctctggcat tggtggcaaa 
gtggattaa 






gggaaatggg 


tcgaagtcga 


agctatgtct 


720 


ggacaaaaag 


atatgtatct 


ccttcaccat 


780 


ccaggtgtca 


aacgcattcg 


tttctttatg 


840 


aaatgtcttg 


aaaatgttgg 


actccttcgt 


900 


attgttccaa 


ttcaattttt 


gaaagccttg 


960 


acagtcggaa 


aaaccaatat 


tggatgtatc 


1020 


actatctata 


tctacaatgt 


ctgcgaccat 


1080 


gctatttctt 


atacgacagg 


agttccagcc 


1140 


acttggaaac 


aagctggagt 


gtataacctt 


1200 


gctttgaatg 


agtatggttt 


gccatgggtt 


1260 










atcgagaaaa 


tcaaagagta 


taaaatcatc 


60 


ggaggatttt 


tcctgctaaa 


accagctcca 


120 


gctgaagttg 


cagctgtttc 


caaggactca 


180 


aaggaagaac 


cccttgaaca 


agatctaatc 


240 


ccagggattt 


atgacttgcc 


tgtaggtagt 


300 


ggcttgacag 


agcaagcaga 


cagcaagtcg 


360 


gctctggttt 


acgttcctac 


taagggagaa 


420 


acagcttctt 


caacaagcaa 


ggaaaagaag 


480 


ctcaagcagg 


tcaagggact 


gggaggaaaa 


540 


gcaaatggca 


agttcaagtc 


agtagacgag 


600 


acaatagaaa 


agcttaaaga 


ctatgttaca 


660 



669 
<210> 80 

<211> 1524 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 80 









aagaatttct 


ctattcccct 






agttttctat 






attttctcag 


catcttatct 


fr- rr r* -t- 4- f- rr t- t- rr 




ggctttgttt 




ctgtctcttt 


atccaatttc 


cgtggaaatc 










ctttggattt 


tggtttgttt 


ttcaaaattg 












gaaagggtac 


ggattttgcc 












ggcaagtcta acggtcgtgc 












gaagcctttc 


aagctttaac 












ccagaagggc 


agagaaattt 












atttaccaga 


ctctcaatat 












ataggagaaa 


acttgtccag 






aaggctgfcgg 


tttggattaa 


gacgcacttt 


ccagacccta 


tgggcaatta 












gaggagatga 


atgagcttta 












atgcaggtag gttttttcat 












caagaaaagt 


tgaaatggct 


gacttatccc 




ttttccctta 






ttttcagcat 


cggttattcg 


cagtctcttg 








tggggttaag 


ggcttggata 


attttgcctt 


gacggtgctt 








aaactttttc 


ttgacagcag 


gaggagtctt 


gtcctgcgct 




tatgctttta 


tcctgaccat 


gaccagcaaa 


gaaggggagg 


ggctcaaggc 


tgttactagt 


1080 


gaaagtctag 


tcatctcctt 


gggcatattg 


cccattctat 


ccttctattt 


tgcggaattt 


1140 


caaccttggt 


ctatcctttt 


gacctttgtc 


ttttcctttc 


tttttgactt 


ggtcttctta 


1200 


ccgctcttgt 


ctatcttatt 


tgtcctttcc 


tttctctatc 


cagtcattca gctgaacttt 


1260 


atctttgaat 


ggttagaggg 


cattattcgc 


ttggtctcgc aggtggcaag gagaccactt 


1320 


gtctttggtc 


aacccaacgc 


atggctttta 


atcttattgt 


taatttcctt 


ggctttggtc 


1380 


tatgatttga 


ggaaaaacat 


taaaggatta 


acagtattga gtttattgat 


tacaggtctc 


1440 


tttttcctta 


ccaagtatcc 


actggaaaat 


gaaatcacca 


tgctggatgt 


ggggcaagga 


1500 
gaaagtattt tctacgggat gtaa 



<210> 81 

<211> 261 

<212> DNA 

<213> Streptococcus pneumoniae 



<400> 81 



acggtattat caagtagata a 



<210> 82 

<211> 867 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 82 











tataacctat 


tattaaccat 


tttattagta 


60 


atgcaaccaa 


ccaaaaacca 


atccagcaat 


120 


gaacgcagta 


aagctcgcgg 


ttttgaagct 


180 


tttttctggc 


tagccattgc 


cttagcattg 


240 








261 



aatagaaata 


gttggaggaa 


atatatgctc 


tcatggttag 


cacgcgttat 


taaagggatt 


60 


gtaattgctc 


ttggatttat 


cctaccggga 


atttccggag 


gggttctagc 


agcaatctta 


120 


ggaatctatg 


aacgaatgat 


tggctttctg 


gcccatccct 


ttaaagactt 


taaagaaaat 


180 


gttttgtact 


ttattccagt 


tgccatcggt 


atgcttctgg gaatcggctt 


attttcctac 


240 


ccgattgaat 


acctgcttga 


aaattatcag 


gtttttgtat 


tatggagctt 


tgcgggagct 


300 


attatcggta 


cagttcctag 


cctcctcaaa 


gaatcaactc 


gagaatctga 


ccgagacaag 


360 


attgatttag 


cttggttatg 


gacaaccttt 


atcatttctg 


gattaggact 


ctatgcctta 


420 


aattttgtcg 


ttggaacctt 


aagcgccagc 


tttcttaact 


tcgtcctagc 


aggcgcacta 


480 


ttggcccttg 


gcgtcttggt 


tcctggcctc 


agcccatcaa 


atttactttt 


gattttggga 


540 


ctctatgctc 


ctatgttgac 


tggttttaaa 


acttttgatt 


tcttgggaac 


cttctttccg 


600 


attggaattg 


gtgcaggtgc 


aactctcatc 


gttttttcaa aattgataga 


ttatgcctta 


660 


aacaactacc 


actcacgcgt 


ctatcatttc 


atcatcggta 


tcgtcctatc 


aagtaccctt 


720 


ttgatcttaa 


ttccaaatgc 


aggaaacgct 


gaaagtatcc 


aatacacagg 


actttcactt 


780 


gtcggttatg 


tcatcatcgc 


cttcttcttt 


gcgctgggaa 


tctggcttgg 


tatttggatg 


840 
agtcaattgg aggataaata taaataa 867 

<210> 83 

<211> 636 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 83 



atcatgtttt 


acttgagttt 


gtcaaggatt 


gctttaagct 


cctctactag tttagtttct 


60 


gtctctgctg 


agccattttc 


ttctttcacg 


aaatcaaggg tttcttggag aaggttttgg 


120 


gctttggcaa 


ggactttttt 


atccgctttt 


tctgcatcta gctgtcctag aaccttgatc 


180 


aattccgtgc 


ttaattgctg 


gatttctgac 


tctttcttac 


ggcgaatcag ccagaaggca 


240 


atcacgccta 


ggagggcaag 


tagactgacc 


acaatcactc 


ctgccggaac tgagtttgtt 


300 


tcagtcatct 


tatctgaatc 


cttactatct 


tccgttcctt 


gttttgcatc cttcttgtcc 


360 


tgtgcaggct 


tgctgtcgct 


agcatttgct 


ttcacatctt 


tgagagagtc caaggcagcc 


420 


cagccttcac 


agactctact 


gcagtatgca 


gaccttactc 


tgtcaaggca ctatcttccg 


480 


gagctttttg 


agcatctagg 


aggacagcct 


tggttgcatc 


gattttcgga tcagatactg 


540 


ttgccaaagc 


tttcaagcgt 


tggtctaact 


cttgactcaa 


ggcacgaagt tcagacttgt 


600 


caacttgctc 


ttgagcttgt 


gtgctcgttg 


agctag 




636 



<210> 84 
<211> 744 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 84 

aataggatta gaattattaa gaaagttggt tctttattgg aaacgatagt atttttaatc 60 
tctgtttttc tagcaggtgt tttatccttt ttttctcctt gtatttttcc tcttctgcca 120 
gtctatgctg ggattttatt ggatgatcaa gaaagtgcaa aaagcttttc tttgtttggg 180 
agaaaggttc tctggtcagg cttgattcga acactttgct ttatcgctgg tatctctctc 240 
attttcttta ttctaggctt tggtgctggt tactttggtc atattctcta tgcaaattgg 300 
tttcgatatg gcatgggagc tattattatc attttgggtc ttcaccagat ggaaattttt 360 
catttgaaga aattagaagt tcaaaaaagt tttaccttta aaaaatcaga ttctaatcgt 420 
tattggtcag cttttttact tggtattacc tttagctttg gttggacacc ttgtattggt 480 
ccagttttaa gttctgtttt agcacttgcg gcttctggag gcaatggcgc ttggcaagga 540 
gcgatttata ctctcattta cactctgggc atggcccttc ctttcttggt attggcacta 600 

gcttcaggtc tagtcatgcc atattttagt aaaatcaagc gtcatatgat gctactaaag 660 

aaaattggtg gtttcctcat tgttttaatg ggaattttgt tactattagg acaagtaaat 720 

gttctagctg gaatttttga ataa 744 

<210> 85 

<211> 936 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 85 













cttttgtgct 


tgggggcttg 










tttatcctat 


ctacgctatg 


gttaaggaag tatctggtga cttgaatgat 




gttcggatga 


ttcagtcaag 


tagtggtatt 


cactcctttg aaccttcggc aaatgatatc 


240 


gcagccatct 


atgatgcaga 


tgtctttgtt 


taccattctc atacactcga atcttgggca 


300 


ggaagtctgg 


atccaaatct 


aaaaaaatcc 


aaagtgaagg tcttagaggc ttctgaggga 


360 


atgaccttgg 


aacgtgtccc 


tggactagag 


gatgtggaag caggggatgg agttgatgaa 


420 


aaaacgctct 


atgaccctca 


cacatggcta 


gatcctgaaa aagctggaga agaagcccaa 


480 


attatcgctg 


ataaactttc 


agaggtggat 


agtgagcata aagagactta tcaaaaaaat 


540 


gcgcaagcct 


ttatcaaaaa 


agctcaggaa 


ttgactaaga aattccaacc aaaatttgaa 


600 


aaagcgactc 


agaaaacatt 


tgtaacacaa 


catacagcct tttcttatct agcgaagaga 


660 


tttgggctta 


atcaacttgg 


tattgcaggt 


atctctcctg aacaagaacc aagtccacga 


720 


caactaacag 


aaattcagga 


atttgttaag 


acctataagg ttaaaacgat ttttacagaa 


780 


agtaacgctt 


cttcaaaagt 


agctgaaact 


cttgtcaaat caacaggtgt gggtcttaaa 


840 


actctgaatc 


ctttagagtc 


agacccacaa 


aatgacaaga cctatttaga aaatcttgaa 


900 


gaaaatatga 


gtattctagc 


agaagaatta 


aagtga 


936 



<210> 86 

<211> 390 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 86 
aaggaaaaca gtatgttaaa aaatctaaaa tcgttcttgc ttcgaggaaa tgttattgac 60 

cttgctgtcg gtgttgtaat tgcctctgct tttggtgcta tcgttacttc acttgtaaac 120 

gacattatca ctcctcttat tttaaatccc gctttgaaag ctgctaaagt tgaacgtatc 180 

gctcaacttt cttggcatgg agtcggctat ggtaacttct taagtgctat tatcaatttt 240 

atctttgtgg gtaccgccct cttctttatt atcaagggca ttgaaaaagc acagaagctg 300 

actggcataa aggaagaaaa aactgacgaa aaaaaaccaa ccgaattgga agtccttcaa 360 

gaaataaaag ctctccttga gaaaaaataa 390 

<210> 87 

<211> 1023 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 87 



gaaatgcatg 


caaaaatgcg 


aaataaaaaa 


caaataaacc 


tag^-ataat ttttataatc 


60 


tgcctaggtc 


ttcttattac 


aatatttttg 


tcattaaagc 


ttggaacaaa agaaattaat 


120 


atcagagatt 


ttttagcagc 


ttttggaatg 


ggtaatacaa atgatgattt tattaaatca 


180 


attatatata 


aaagaatacc 


tagaactatt 


tttgcaattt 


tagcaggttc tagtctf ;fcc 


240 


ataagcggtg 


tattgatgca 


atcagttact 


agaaacccaa 


tagctgatcc aggtatactc 


300 


ggtataaaca 


caggagcaag 


tcttagtgta 


gtaattggtc 


tttctttttt aggaatttca 


360 


tcaagcataa 


gccatataag 


ttttgcaatc 


attggtggct 


tagtaagtgc aatttttgta 


420 


tacgcgattg 


ctgtaagcgg 


aaaagcaggc 


cttaccccta 


taaaacttgc cttatcagga 


480 


acttgtgtta 


gtatggcttt 


aagcagtttt 


gtaagttttt 


taattttacc gaataataac 


540 


gtcttagaca 


aatttagatt 


ttggcaaata 


ggtagccttg 


gagcagctac attatcttct 


600 


atatctacac 


tactaccttt 


tataatttta 


ggtcacttga 


tagctatatt tatttcatca 


660 


gatttaaacg 


ctttagctat 


gggtgatgaa 


atggctgttg gtcttggagt taatgttaat 


720 


aggataagat 


cacttgcaat 


aattgcaagt 


gtgcttttat 


gttcaagtat tactgcaatt 


780 


ggtggaccta 


ttggcttcgt 


aggtcttata 


gttcctcact 


tttgtggctt atttataagc 


840 


aaagatatac 


gcacaatgac 


catttcttca 


tcttttatag gtgcagagct cttgcttata 


900 


tgtgatataa 


tcggccgtat 


gttaggtaaa 


ccaggtgaaa ttgaagtagg gataattact 


960 


gcaataatcg 


ggggtccagt 


acttatttat 


gtaactatga 


aaaatagagg ggttaataac 


1020 
taa 1023 



<210> 88 
<211> 1011 
<212> DNA 

<213> Streptococcus pneumoniae 






<400> 88 
ctaatgcaaa 


atttaattat 


aggtattcaa 


aaaagaaaaa atagaataac actattttcc 


60 


tcactatttt 


tattaataat 


aatcagtcta 


tcatttttca ttttacttat cggagatgaa 


120 


agttattctt 


tttcaacttt 


gattaaagtc 


ttaaatagtg aaactgttcc tggagctagt 


180 


ttttcgatta 


tggaaattag 


attaccaaaa 


ttattagcag gaattatagc tggctggtct 


240 


tttggattgg 


caggatttat 


ctttcaaact 


atgttaagaa atcctcttgc aagtcctgat 


300 


ataatcggtg 


tcacaagttc 


ttcatctatt 


gcagcggtct tttgcatatt ggtattaaaa 


360 


acaaatagtt 


taactactgg 


aattatttca 


ataacttgtg gactaacatc atctttaata 


420 


ttatttttac 


tagctaaaaa 


agatggtttt 


tcagcagcaa gactgataat attaggtatt 


480 








tcatttttat tgttgaaagt agcaagatat 


540 


gaattacaag 


aagttatgag 


atggctcagt 


ggctctttat cttttacaaa gttagatgac 


600 


atacctcttg 


ttctaatagt 


aagtattatt 


gctactatat tagttttatt ttttaataaa 


660 


agactagaaa 


ttattgaact 


tggtgaagaa 


atagcaatcg gacttggagc aaatcccgag 


720 


ctttcaaggc 


ttgttttaat 


tttttgcgct 


gtatctttaa ctgctttttc tacttcaatt 


780 


acaggaccaa 


tagcttgtat 


atctttttta 


gctggtccca tagccttaaa tattggcaag 


840 


aaaagaagcc 


caatattagc 


tggattggtt 


ggaattttac tagttttgtt atcagacata 


900 


ttctctcaaa 


atattttacc 


agctagatat 


ccagtaggtg ttgtaactgg cttgttaggt 


960 


tcaccatact 


taatatactt 


actaataaaa 


atgaacagga ggaatatata a 


1011 



<210> 89 
<211> 936 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 89 

aataaaggta ggggtgttat gaattgtttc ttgaaaatga ataatgtaag tgttcgttat 60 
gatgacgtaa tagctttaaa agatataact ttacaaataa ataagggaga tttcattggc 120 
ttattaggtt caaatggtgc aggtaaatct acgttaatta attctattgt aggttttcaa 180 
gagatttatt 


taggagaaat 


agagtattgt 


gataaagatt 


tgatagttag 


ttctcaacct 


240 


tttgctcatt 


taggctttac 


tcctcaaacc 


acagtaattg 


atttttatac 


tactgtgaag 


300 


gacaatgtaa 


tattggggct 


gaaccttgct 


ggaaagtttg 


ggaaaaatgc 


tgagaagttg 


360 


tgtcaaatag 


ccttagaaat 


tgttgggtta 


gctgataaaa 


aaaataattt 


ggtagaaaca 


420 


ttgtcaggtg 


gacaactgca 


acgcgtccag 


attgctagag 


caatagctca 


taatccagat 


480 


ttttatattt 


tagatgaacc 


taccgttggt 


ttagatactg 


aatctgccga 


aaaattttta 


540 


atgtatttaa 


aagataagag 


tttggaagga 


aaaactatta 


tcatatcttc 


acatgacata 


600 


aatctactcg 


aaaagttttg 


taaaaaaata 


ctttttttac 


aaaatggctc 


catatcattt 


660 


tttggtgata 


tgcgtgactt 


tgtagataat 


tcaactatca 


aattaaattt 


ttcaatgcag 


720 


aatagaattt 


ctagatatca 


aattgaattt 


ttagaaaatt 


ttagatttaa 


agttcacatc 


780 


gaagataatg 


atagttttac 


aatagaagtc 


cctatagaag 


aaaagatctt 


agatgttatc 


840 


aatgaggtag 


gaaaagcatg 


tgaaattaaa 


aacttttcaa 


caagtaaatt 


aaccttacaa 


900 


gaaagttatt 


tgcaaagaat 


aggaggagaa 


aaatga 






936 



<210> 90 

<211> 846 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 90 



attaacctta 


caagaaagtt 


atttgcaaag 


aataggagga 


gaaaaatgaa 


ggctgatcaa 


60 


ttaaggcaca 


aatcggactt 


aggtttaaga 


ggtctagcga 


ttattgctaa 


aaatgagatt 


120 


attgcttttt 


ttagaagtaa 


aggtttaatt 


atttctcagt 


ttctacaacc 


aatcttatat 


180 


gttgttttta 


taataatagg 


attaaattct 


tcgataaaga 


acattcagtt 


taatgatata 


240 


aaaacctctt 


atgcagaata 


tacaatcatt 


ggtgttatag 


ctttattgat 


aatcgggcag 


300 


atgactcaag 


ttatttatag 


ggtgacaata 


gataaaaaat 


atgggctact 


tgctcttaag 


360 


ttatgcagtg 


gagttcgtcc 


tttatattat 


attttaggga 


tgagtatcta 


ttctatatta 


420 


gggttgatag 


ttcaagaaat 


tattatatat 


ataattacgt 


tagcgtttga 


gataaatatc 


480 


gcaatggata 


gattttttta 


tacagttttg 


ttatctattg 


ttgttttatt 


attttgggac 


540 


tcccttgcaa 


ttttacttac 


aatgtttatc 


aatgattaca 


gaagacgtga 


tattgtaata 


600 


cgttttgtac 


taacaccgct 


tggttttaca 


gctcctgttt 


tctacttaat 


agattctgct 


660 
cctagtatbg tgagatggat tggtcagtta aatcccttaa cttatcaatt aactattttg 720 

agaaactttt attttaaaaa ttcaacaact ttggaattag ttttcttatt gttaacatca 780 

ttacttgtcc ttatatctgt atcttttatt ataccaaaga taaaattgat actgatagaa 840 

agataa 846 

<210> 91 

<211> 1038 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 91 



aatatgttaa 


aggaaataaa 


aaggagaaac agaatgaaaa ataaacgttt aattggaatt 


60 


attgctgcat 


tagcagtctt 


agtagcagga agcttgattt attcttcaat gaataaatca 


120 


gaagctcaga 


ataataagga 


tgagaagaaa ataaccaaga ttggtgtgct tcaatttgtg 


180 


agccatccat 


cccttgattt 


gatttataaa gggatccaag atggacttgc agaagaagga 


240 


tataaagatg 


atcaagttaa 


aattgatttt atgaactcag aaggtgacca aagtaaggtt 


300 


gcgacaatga 


gtaaacaatt 


ggttgcaaat gggaatgacc ttgtggttgg tatcgcaaca 


360 


ccagcagccc 


aagggttggc 


tagtgcaaca aaagacctac cggttatcat ggccgctatt 


420 


acagacccaa 


ttggtgctaa 


cttggttaaa gatttgaaaa aaccaggtgg caacgttaca 


480 


ggggtatctg 


accacaatcc 


agctcaacaa caagttgaac tcatcaaggc tctgacaccg 


540 


aatgtgaaaa 


caatcggagc 


tctttactca agtagcgaag acaattcaaa aacacaggtc 


600 


gaagaattta 


aggcttatgc 


tgaaaaagca ggtctgacag tggaaacatt tgcagttcct 


660 


tcaacaaatg 


aaattgcctc 


aactgtcact gttatgacta gcaaggtaga tgctatttgg 


720 


gttccaattg 


ataacaccat 


tgcatcagga tttccaacgg ttgtctctag caatcaaagt 


780 


tctaagaaac 


caatttatcc 


cagtgcgaca gctatggtag aagtaggtgg tttggcatca 


840 


gttgtaattg 


accaacatga 


ccttggtgtg gcaacaggta aaatgattgt gcaagtcttg 


900 


aaaggtgcaa 


aaccagccga 


taccccagtc aatgtctttt caactggtaa gtcagtcatc 


960 


aataaaaaaa 


tagcacaaga 


actaggtatt actattcctg agtctgttct caaagaagca 


1020 


ggacaagtca 


tcgaataa 




1038 



<210> 92 
<211> 792 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 92 



0ca.aa.caa.tc 


ttgaaaggag 


ccaagttaag 


caaatgacag caattgtaga attaaaaaat 


60 


gcaaccaaaa 


tcgttaaaaa 


tggctttgat 


gaagaaaaga ttattttaaa tgatgtttcc 


120 


ttagaaattt 


ttgaacggga 


ctttatcacg 


attttgggcg gaaatggtgc tggaaaatca 




actctcttta 


acactatagc 


agggacctta 






ggtgaagatc 


tcactaagtt 


ttcacccgag 


aagcgtgcca agtacctgtc tcgtgtcttc 


300 


caagatccaa 


agatggggac 


agctccccgt 


atgacggtcg ctgaaaatct tttaatcgcc 




aagtttcgtg 


gtgaaaagcg 


tggattgtta 






tttcaggcaa 


ccattgaaaa 


agtaggaaat 


ggtcttgaga aacacttgaa tacaccgatt 


480 


gagttcttat 


caggtggaca 


aagacaggct 


ttgagtctct tgatggcaac cttgaagcga 


540 


cctgaattac 


tcctgttaga 


tgagcatact 


gctgccctgg atccaaagac tagtgttgct 


600 


ttgatggaat 


tgacagatga 


atttgttaag 


aaagatcagc taacagccct tatgattact 


660 


catcatatgg 


aagatgctct 


caaatacggc 


aatcgcttga ttgtcatgaa agaaggacga 


720 


attatccaag 


atttaaacca 


agaagaaaaa 


gcaaaaatga aaatctctga ttattatcaa 


780 


ctctttgaat 


aa 






792 



<210> 93 

<211> 741
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 93 

aaagaaaatg gtacaatatt tctaagagaa aatacaatgg gaggtaaaat gaggttatta 60 
cctataagaa aaatatcacg tcagtctaaa aggttagcac tttttttgac gttttgtgct 120 
ggatatgtgg atgcttacac ttttattgtt cgcgggaata cccttgtagc tggacaaact 180 
ggaaatgttg tctttctttc agtagaatta attaaaaata atgtttcgga tgttagggac 240 
aaggttctca ccttgctagc gtttatgatg ggagtctttt tattaacgat ttataaggaa 300 
aaattgagaa ttgtgaaaaa acctattctg tccttgattc ccttggcaat cttatcaatc 360 
attattgctt ttgtgccgca aactgtggat aatatctatc tagtgccgcc cttggccttc 420 
tgtatgggac tggtgacaac tgcttttgga gaagtgtcgg gtattgccta taataacgct 480 
tttatgacag ggaatatcaa acggaccatg ctggcttttg gagattattt ccgaaccaag 540
cacactcctt ttttgcgtga aggattcata tttgttagcc tgcttagtag ttttgtcctt 600
ggcgttgtct tttcagccta tttgacgatt ttctatcatg aaaagaccat tcttggtgtt 660
cchattatga tgagcgtttt ttacctcagc atgctttttg cctcttggca gaaaaaagta 720
aaagaaaaag cttcatttta g 741



<210> 94 

<211> 864 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 94 



gaaaagaggt gtcctatgat taaaaaaatt taccccattt ttaccatttt actaggtgct 60
gctatttatg cttttggact gacttatttt gtagttcccc atcatctctt tgaaggaggg 120
gcgacaggca ttaccctcat caccttttat ctttttaaaa tccctgtttc cctcatgaac 180
ctgctgatta atattcccct tttcatccta gcttggaaga tttttggagc caaatccctc 240
tattctagtt tactaggaac cttagctttg tccggctggt tagctttttt tgagcatatt 300
ccccttcata ttgatcttca aggtgattta ctaatcacag cccttatagc gggaatccta 360
ttgggaattg gccttggaat tatttttaat gctggaggta caactggcgg aactgatatt 420
ctagctcgta ttctcaacaa atacactcat atatccatag gaaaactgct ctttatctta 480
gatttttgta ttctcatgtt gattctccta atcttcaagg atttgagatt ggtttcctac 540
acgcttttgt ttgattttat tgtttctcgt gttattgatt tgattggtga aggaggatat 600
gccggcaaag gctttatgat tatcacaaaa cgtcctgacc aacttgctaa ggcgattaat 660
gatgacctcg gaagaggtgt tacttttatt tctggtcaag gctactatag taaagaaaat 720
ttgaaaatca tctactgtat tgtcggaaga aatgaaattg tgaaaacgaa ggaaatgatt 780
catcgaatcg atcctcaagc ctttataact attacagaag cccatgaaat cctaggagaa 840
ggcttcacct ttgaaaaaga ataa 864



<210> 95 
<211> 300 
<212> DNA

<213> Streptococcus pneumoniae 
<400> 95 

aaattatttg gaggaatcat taacatggca aacaaacaag atttgatcgc taaagtagca 60 
gaagctacag aattgactaa gaaagactca gcagcagcag
gtagctgact atcttgcagc tggtgaaaaa gttcaattga 
gttcgtgagc gcgcagaacg taaaggtcgc aacccacaaa 
gcagcttcta aagtaccagc attcaaagct ggtaaagctc 

<210> 96 

<211> 1095 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 96 



tatacttact


tatggagaaa 


atacatgaaa 


cgtgagattt 
















aaggaatatg 
















atcctttatc


aatgctaata 




aggtgtttca 


cagacaacta 


ctttctagtc 


tttacaagta 


tctaaccgag 


gaggttgaaa 


ttctatcgta 


atgtaatgaa 


aaaagtttcc 


accaagaaaa 


agagctgaaa 


atatcaagca 


aaaactcttt 


ctaggtgatg 


tatatcgatc 


aagagcaccc 


acaacagctt 


tcaaatcgag 


aataaagaac 


gagatttagc 


cattattgcc 


cttctcttgg 


gaagctgtta 


atctagatct 


aagagatctc 


aatctaaaaa 


cgaaaaggtt 


gcaaacgtga 


ctcagtcaat 


gtcgctgctt 


aattatctgg 


ccattcggaa 


tcaacgctat 


aaaacggaaa 


ttaactctct 


acagaggtgt 


tcctaatcgt 


atcgatgctt 


gctaaatact 


cagaggattt 


taaagtgcgt 


gtaacacccc 


gcaactaggc 


tctatgatgc 


gactaaatca 


caagttttag 


gctagcacac 


aagtcactga 


cctctatacc 


catattgtta 


ctggatagtt 


tatga 







ttgaagctgt 


atttgcagca 


120 


tcggttttgg 


taactttgaa 


180 


ctggtaaaga 


aatgacaatt 


240 


ttaaagacgc 


tgttaaataa 


300 



t ac t cjcjaacg 


aatcgacaaa 






ggctgtgccc 






cagctgggtt 










tacgtgaacg 


tcccttgctg 




tcaatcgaac 


cttatcagca 


360 


acgatcaggg 


ggaaccttat 


420 


agaaagaaac 


ccttgctgcc 


480 


aaacagaagg 


ttttctaact 


540 


ctctctcatc 


attcaacaaa 


600 


catctggtgt 


tegcttatet 


660 


tgatggttat 


tgatgttact 


720 


ttgctaaacc 


ttatttagag 


780 


aaacagatac 


ageccttttt 


840 


ctagcgttga 


gaaaatggtt 


900 


ataaactgcg 


ccatacacta 


960 


tcagt caeca 


actaggacat 


1020 


gtgatgaaca 


aaagaatget 


1080 



1095 
<210> 97
<211> 405 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 97 

ctgagggctg caccgacagc gatccccata ccaccaccta cgataccatt ggcaccaagg 60 
ttcccagcat caaggtcagc gatatgcata gatccacctt tccctttaca ggttccagtg 120
tatttaccaa ggatttcagc catcattccg ttgaggtcaa tccctttagc aatagcttgc 180 
ccgtgtccac ggtggtttga ggtaatcaga tcatctggat tgagagctaa catagccccc 240 
acgttagctg cctcttcacc aacagaaaag tgcgtcattc ctggcacttt ccctttcttt 300 
actaattgtg caatttttaa gtccatgcga cggatttctt ccatcttacg gaacatttct 360 
agcaaaagat ttttatctaa agttgacatc ttcttgcctt tctaa 405 

<210> 98 

<211> 1716 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 98 



actctacaag agaggagttc aataatgaat aaactaattg catttatcga gaaaggaaag 60
ccttcctttg aaaaactatc tcgtaatatc tatcttcgtg ctattcgtga tggtttcatt 120
gcaggtatgc ctgttattct cttctcaagt atctttatct tgattgcctt tgtaccaaac 180
tcatggggct ttaaatggtc tgatgaagtt gtagcctttc tgatgaaacc ttatagctat 240
tctatgggta ttctggctct cttggtagct ggtacaacag ctaagtcatt gactgactca 300
gtaaaccgga gcatggaaaa aaccaatcaa atcaagtata tgtcaacatt gttggcagca 360
attgttggtt tgttgatgtt ggcagctgat cctatcgaaa gtggtctagc tactggattc 420
ttggggacaa aaggtttgct ttcagccttc cttgctgcct ttgttactgt agccatctat 480
aaggtttgtg ttaagaacaa cgtcactatt cgtatgcctg acgaagttcc accaaatatc 540
tcacaagtct ttaaagatgt gattccattc actctatctg ttgtttctct ttatgctctt 600
gacttattag cacgttattt tgttggttct agtgtggcag aatcaatcgg taaattcttc 660
gcaccactct tctcagcagc agacggatac cttggtatta ccattatctt tggtgccttt 720
gccttcttct ggtttgttgg gattcatggt ccatctatcg ttgaaccagc tatcgcagct 780
attacctatg ccaatgccga agttaacttg aaccttctcc aacaagggat gcatgcagac 840
aagattctta cttctggtac acaaatgttt atcgttacca tgggtggtac aggtgcgaca 900
ttggtcgttc catttatgtt catgtggttg acaaaatcga aacgtaaccg tgcaatcgga 960
cgtgcttcag tagttcctac cttcttcggt gtaaatgaac caatcttgtt tggtgcacct 1020


cttgttttga 








1080 




tctttattga 


aactcttgga 


atgaactcat tcactgctaa tctaccatgg


1140 






tctagttctt 


ggaactaact tccaagtgct atcattcatt 


1200 




ttctaatcgt 






1260 




















1380 










1440 


acaagtggtc tccttgcaaa tgctttgaat aaggcagcag cagaatacaa tgtccctgtg 1500
aaagcagcag caggcggcta tggtgctcac cgtgaaatgt taccagagtt tgatcttgtt 1560
atccttgccc ctcaagttgc ttcaaacttt gaagatatga aagcagaaac agataagctc 1620
ggtattaaac tagcgaaaac agaaggcgct caatacatca aattaactcg tgatggaaaa 1680
ggtgctcttg cattcgtaca agcgcaattc gattaa 1716



<210> 99 
<211> 807 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 99 

gagtttatta tggtttcttc ggaatttatc tcaaagattg aatttgcttg caataagaaa 60 
gaaagtcttt atagtcaaag caaatttaag tatgcgattc gttcgatgtt cgcaggtgca 120 
tttttaacct tcagtactgc tgcaggtgca gttggggctg acttgattaa taaaattgca 180 
ccaggtagtg gacgcttcct ctttccattc gtttttgctt ggggcttggc ctacattgtt 240 
tttttgaatg ccgagttggt cacttcaaac atgatgttct tgactgctgg tagtttctta 300
aaaaaaatct cttggagaaa aacagctgag attttactat actgtacctt gttcaacctt 360 
atcggagcct tgatagcagg gtggggcttt gctcattcgg cagcctatgc gaatctgaca 420 
cacgatagtt tcatctcagg tgttgttgag atgaagttag gccgctcaaa tgaattggtc 480 
ttgcttgagg cgattttggc aaatattttt gtaaatattg cgattctgtc atttattttg 540 






gtcaaagatg gtggtgccaa actttggctt gtgttgtcag ctatttacat gtttgtattc 600
ttaacaaacg agcacattgc ggcgaacttt gcttctttcg cgattgtgaa attcagtgtt 660
gctgcggatt caattgccaa cttcggtgtt ggaaatatgc ttcgccactg gggtgtgact 720
ttcatcggaa actttatcgg aggaggcctc ttgatgggtc ttccatatgc cttcctcaat 780
aaaaacgaag atacttatgt agattaa 807



<210> 100 

<211> 1356 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 100 



gaaataatgc ttgatttact gaaacaaacc atttttacca gagattttat ctttatcctg 60
attttgttag gtttcatcct tgttgtgacc ctcttattac tggaaaatag acgtgataat 120
attcagttga agcaagtcaa tcaaaaggtt aaagatttga ttgcaggaga ttattccaag 180
gttcttgata tgcaaggtgg gtctgaaatc accaatatta ccaataattt gaatgacttg 240
tcggaggtta ttcgtctcac tcaggaaaat ctagaacaag agagtaagag gctaaatagt 300
attctgtttt atatgacaga tggggttctt gcgactaacc gtcggggtca gattatcatg 360
attaacgata cagccaagaa gcaactgggg ttggttaagg aagatgttct gaatagaagc 420
attttggaat tgctcaagat agaagaaaac tatgaattgc gtgatttgat tacccaaagt 480
ccagaattgt tgctagattc ccaagatatc aatggcgaat atttgaacct tcgagttcgc 540
tttgccttga tacgtcgaga gtctggcttt atttcaggtt tggtggctgt tttgcatgat 600
acgacggagc aggagaagga agaacgcgaa cgaagactct ttgtttccaa tgttagccat 660
gagttacgga ctcctctgac tagcgtaaaa tcctatcttg aagccttgga tgagggggct 720
ttgtgtgaaa ctgtagcacc agactttatc aaggtttctc ttgatgagac caaccgtatg 780
atgcgcatgg tgacggatct cctccatctt tcacgtattg ataatgctac cagtcaccta 840
gatgtggaac tgattaactt cactgctttt attaccttta tcctcaatcg ttttgacaag 900
atgaaaggac aggaaaagga gaaaaaatat gagttggtga gagattatcc catcaattct 960
atctggatgg aaattgatac agataagatg acgcaggttg tcgacaatat tttaaataat 1020
gctattaagt attcgccaga tgggggtaaa atcactgtca gaatgaagac aactgaagac 1080
cagatgattt tatccatttc tgaccacggt ttggggattc ctaagcagga tttaccacgt 1140
atctttgacc gtttctatcg tgtggatcgt gctagaagtc gtgcacaagg tggtacaggt 1200

ctaggactgt ctatcgctaa agaaattatc aaacaacata agggctttat ttgggccaag 1260 

agtgaatacg gcaagggttc aacctttacc attgtactcc cttatgataa ggatgcagtg 1320

aaagaagaag tatgggagga tgaagtagaa gactag 1356 

<210> 101 
<211> 594 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 101 

attcttgctt tattgtctgg tttattgtac catactagtg tatatgcagt taaaaaggag 60 

attcttgtga atacacggaa aaagacacaa tttatgacaa tgacagccct tttaacggct 120 

attgcgattt tgattccaat tgttatgcct ttcaagattg tcattccacc tgcttcctat 180 

actttgggga gccacatcgc tatttttata gccatgttct tgtcgccctt gatggcagtt 240 

tttgtcatcc tagcctctag ttttggattt ttgatggctg gctatcccat ggttatcgtt 300 

tttcgggctt tttcccatat atcttttggt gctttaggag ctctttacct acaaaaattc 360 

cccgataccc tagataaacc aaaatcttcc tggattttca actttgtttt ggctgttgtt 420 

catgcccttg ctgaagtatt ggcctgtgtc gttttttacg caacttctgg taccaatgta 480 

gaaaatatgt tttatgttct atttgtacta gttggatttg gtacaattat ccatagtatg 540 

gtagactata cattagcact agctgtctat aaagtgcttc gaaaacgccg ttaa 594 

<210> 102 
<211> 867 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 102 

attgctatta aaaagtgcta taataatagt atatatagaa gaaaagagga cggggatatg 60 

aagaacaaaa gaatatttaa agacttccaa gcttcaaaaa tgagtttaaa catttacaca 120 

agccccttgt tagcctttgt ttttgtcttc ataggagagt ttgtggcttt tactttgtat 180 

ggtattggct tgttagctct catcggactt gctagaaatt ttggagaggc tggtcaaaat 240 

cttgcaagct acttgcagac cttgcatcag agcttgacgg ataaaacaag tgactttcgt 300 

ttaattttag gattactggc ctttggtttt attcttaaca ctgtgttcag atggacaaga 360

aaagttgaga aaagacctat tcgaaccttg ggattttata gagagaattt cctcagcaat 420 
cttctgaaag gatttagtct aggcctggca ctttttcttc tgaccttgtt aggtttagtg 480
gtcttaggtc aatatcgttt ggaatccatt cacttgaatc cttattctct tgcctttgtc 540
gtctttacta tcccattttg gattttacag gggacagcag aagaagtggt ggcccgtgct 600
tggctacttc ctcaattggc ctcaagaacc aatctaaaac tagctattct tatatctagc 660
ctgttcttta ccctgcttca tatgggcaat tctggtctca cccctctatc tctagtaaat 720
ctctttttat tcggagttgc catggctctt taccttctca aaactgatac agtttggggt 780
gttgcaggta ttcatggtgc ttggaatttt gctcagggta atctctttgg gattttagtt 840
agtggtcaac cgtcagaacg tctctga 867



<210> 103 

<211> 2193 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 103 



gagaatattc ggaaaaggag actaaaaatg aagaaaaaat ttctagcatt tttgctaatt 60
ttattcccaa ttttctcatt aggtattgcc aaagcagaaa cgattaagat tgtttctgat 120
accgcctatg caccttttga gtttaaagat tcagatcaaa cttataaagg aattgatgtt 180
gacattatta acaaagtcgc tgagattaaa ggctggaaca ttcagatgtc ctatcctgga 240
tttgacgcag cagtcaatgc ggttcaagct gggcaagccg acgctatcat ggcagggatg 300
acaaagacta aagaacgtga aaaagtcttc accatgtctg atacttacta tgatacaaaa 360
gttgtcattg ctactacaaa gtcacacaaa attagcaagt acgaccaatt aactggcaaa 420
accgttggtg ttaaaaacgg aactgccgct caacgtttcc ttgaaacaat caaagataaa 480
tacggcttta ctattaaaac atttgacact ggtgatttaa tgaacaacag cttgagtgct 540
ggtgccatcg atgccatgat ggatgacaaa cctgttatcg aatatgccat taaccaaggt 600
caagacctcc atattgaaat ggatggtgaa gctgtaggaa gttttgcttt cggtgtgaaa 660
aaaggaagta aatacgagca cctggttact gaatttaacc aagccttgtc tgaaatgaaa 720
aaagatggta gtcttgataa aattatcaag aaatggactg cttcatcatc ttcagcagtg 780
ccaactacaa ctactctcgc aggattaaaa gctattcctg ttaaggctaa atatatcatt 840
gccagcgatt cttcttttgc cccttttgtt ttccaaaatt caagcaacca atacactggt 900
attgatatgg aattgattaa ggcaatcgct aaagaccaag gttttgaaat tgaaatcacc 960
aaccctgght ttgatgctgc tatcagtgct gtccaagctg gtcaagccga tggtatcatc 1020
gctggtatgt ctgtcacaga tgctcgtaag gcaacttttg acttctcaga atcatactac 1080
actgctaata ccattcttgg tgtcaaagaa tcaagcaata ttgcttctta tgaagatcta 1140
aaaggaaaga cagtcggtgt taaaaacgga actgcttctc aaaccttcct aacagaaaat 1200
caaagcaaat acggctacaa aatcaaaacc tttgctgatg gttcttcaat gtatgacagt 1260
ttaaacactg gtgccattga tgccgttatg gatgatgaac ctgttctcaa atattctatc 1320
agccaaggtc aaaaattgaa aactccaatc tctggaactc caatcggtga aacagccttt 1380
gccgttaaaa aaggagcaaa tccagaactg attgaaatgt tcaacaacgg acttgcaaac 1440
cttaaagcaa acggtgaatt ccaaaagatt cttgacaaat acctagctag cgaatcttca 1500
actgcttcaa caagtactgt tgacgaaaca acgctctggg gcttgcttca aaacaactac 1560
aaacaactcc ttagcggtct tggtatcact cttgctctag ctcttatctc atttgctatt 1620
gccattgtca tcggaattat cttcggtatg tttagcgtta gcccatacaa atctcttcgc 1680
gtcatctctg agattttcgt tgacgttatt cgtggtattc cattgatgat tcttgcagcc 1740
ttcatcttct ggggaattcc aaacttcatc gagtctatca caggccaaca aagcccaatt 1800
aacgactttg tagctggaac cattgccctc tcactcaatg cggctgctta tatcgctgaa 1860
atcgttcgtg gtggtattca ggccgttcca gttggccaaa tggaagccag ccgaagcttg 1920
ggtatctctt atggaaaaac catgcgtaag attatcttgc cacaagcaac taaattgatg 1980
ttgccaaact ttgtcaacca attcgttatc gctcttaaag atacaactat cgtatctgct 2040
atcggtttgg ttgaactctt ccaaactggt aagattatca ttgctcgtaa ctaccaaagt 2100
ttcaagatgt atgcaatcct tgctatcttc tatcttgtaa ttatcacact tttgactaga 2160
ctagcgaaac gcttagaaaa gaggattcgt taa 2193



<210> 104 
<211> 774 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 104 

actagcgaaa cgcttagaaa agaggattcg ttaatggcaa aattaaaaat tgatgtaaat 60 
gatttacaca agcactatgg aaaaaatgaa gtcctaaaag gaattacgac taagttctat 120 
gaaggagatg ttgtttgtat catcggtcct tcaggttctg gtaagtcaac tttcctccgt 180 
agcctcaatc ttttagaaga agtcactagc ggtcacatca ctgtgaacgg ctatgattta 240
actgaaaaaa caaccaatgt tgaccacgtc cgtgaaaata tcggcatggt attccaacac 300
ttcaacctct tccctcatat gtctgtattg gacaacatca cctttgctcc tattgagcac 360
aagttgatga ctaaggaaga agctgaggaa ttgggaatgg agttgcttga aaaggttgga 420
ctagcagata aagctaatgc caatccagat agcctatcag gtggtcaaaa acaacgtgtg 480
gccatcgctc gtggcctagc aatgaatcca gacatcatgc tcttcgatga accaacttct 540
gcccttgacc ctgagatggt tggagacgta cttaacgtta tgaaggaatt ggctgagcaa 600
ggcatgacca tgattatcgt aacccatgag atgggatttg ctcgtcaggt tgccaaccgc 660
gttatcttta ctgcagatgg cgagttcctt gaagacggaa cacctgacca aatctttgat 720
aacccacaac accctcgtct gaaagagttc ttagataagg tcttaaacgt ctaa 774



<210> 105 

<211> 372 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 105 



ctaggagaag ttatgcgtat tatctatcta attattggtt ttttatcgct gaccttggct 60
attgttgggg ttgttttacc cttgttgcct acaacacctt tccttttgtt gtctattgct 120
tgtttctcca gaagttccaa gcgattcgaa gattggcttt atcataccaa gctctatcaa 180
gcatatgtag ctgattttcg tgagaccaag tctattgcgc gtgaacgaaa gaaaaaaatc 240
atcgtctcta tctacgtctt gatgggaatt tctatttatt ttgcacctct tttaccagtc 300
aaaatcggtc tgggtgcttt gaccatcttt attacttatt atctcttcaa ggtcattcca 360
gacaaagaat ag 372



<210> 106 
<211> 555 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 106 

tactgtacgt tttatcatag aaatttttac tttattttct catcaaatga gatttgcatc 60 
aatctcttgt cttacttgcg tttcttcttc gctttcttca ttttgttagc catacgtttc 120 
atggactgtt tcatggcaaa ttcaccaatt ttacctttca aaccgccacc aaacatctgg 180 
ctcatatctg gcattcctgc tcctccgaga

atcattcctt caagggcaga catatccatt 

ttatttggat taatccccat ttgcttcatc 

tgcatgagct gtttagcctg gttaaagtcc 

tttccagaac cagcagcaat acgacggcga 

cgctcttcag gtgtcatcga agacacaatg 
accttcatgt tttga 



gctgataagt 


caggcatacc 


gccttgtccc 


cctcccatat 


ttggcatatt 


tttaggaagg 


attttattca 


tatccccaga 


cataacaccc 


ttgatgaatt 


tattgacttc 


gacgaatgta 


cggcttggat 


ttaacaaatc 


tgggttttca 


gcacgtttac 


gagcaatctg 


gcgttcatcc 



<210> 107 
<211> 396 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 107 

ttatcaggat caaaatatca aatgaaaaaa gaacaatttt atccgctagg aatttttcta 

gctgctatgt tgggcggact tgtccgatat ctagtttcca cctggttacc agccagtcca 

gatttccctt ggggcactct ctttgtcaac tatctgggaa ttttctgctt gatttatctt 

gtcaagggct atctggtcta taaggggact agtaagggct tgattttagc actggggacg 

ggtttttgcg gaggtttaac aactttttcc agcctaatgc ttgatactgt gaagctgctt 

gatacagggc gttatcttag tttgatactg tatttgcttt tgagtatcgg tggaggcctg 

cttttagctt actatttggg gaggaagaaa tggtaa 

<210> 108 
<211> 1998 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 108 

aaaaatatgg ccattagtca gatgaaaaga atctctctac tattttctaa aagtagtctt 

gatgatgttt taaaaactat tcaagaacta gagtcagtgc agttccgtga tttaaaggtt 

caggataact ggtcagaagc tctagaaaaa gatgaagttg tatttccaac tattcaaatt 

tttcatactt ctaattccaa tcatggggtt attgagggaa atgatgcctt gacttatttg 

atgaatcaac aacaacattt agaagcaact gtagagaaat tacaagaata cctaccgaaa 

gaaaacacgt ttaaattatt gcagcaacct ccgataacta cctcttatga agaattagag 

aaatttggta aagctaatgt tgctgagggt gttcttaaaa aagtgaatca tcaaattaac 
agagttcatg aattagaaag acacattcaa agtaataatg aggaaataga gcgattaata 480
aagtgggaaa aattagaaat tgttcctgcg aatttagaac aattttcttt ctgtaaagga 540
aaagtcggaa caattccaag gactgaagat aatcgcttat acaatagtct tttagaaaac 600
aatattgaag ttcaagaaat attttctaat gatagagagt acggtgttgt tgttttctat 660
cagtctagtt actctataga ttttgatgaa tacttatttg aaccatttga ttattctaga 720
aaggaattac cgaagcagcg agtagtagat ttagatcaag aaaacatgca gttaataact 780
gaaaaagaga atattatcgc atcgttgcaa gattcaaaga aatatttgat agatttacaa 840
tggcaaatag actatatttt atctatctat gctcgtcaaa tctctaagaa taactttttg 900
tgcactccgc atctagttgc attagaagga tggatagaag aaactcgtat tttatatttt 960
ataaaagtta tggatgagca ttttggacat tctatttata tttatgaatc ggaaacattg 1020
acggataatc aagatgaaat acctatcaaa ttaacgaatc attctttaat tgaaccattt 1080
gaattattga cagaaatgta tgctctgccc aaatattatg agaaagatcc tacacctgta 1140
ttagcaccat tttactttac attttttgga atgatggttg ctgatttagg ctatggttta 1200
ctattgtttt taggaacaat gttagcatta aaaatttttc atctaccttc agcaactaag 1260
agatttttaa aattctttaa tatattaggg gtagccgttg caatttgggg tggaatctat 1320
ggctcatttt ttggatatga gttgccattt catctgatat ctacaacctc tgatgtcatg 1380
actatattag tagtgtcagt tgtgtttggg tttattacag tatttgcagg tttgttagct 1440
tcaggactac aaaaagtaag aatgaataaa tatgcagaag catataattc aggatttgcg 1500
tggtgtgtta ttctgcttgg cttgttattt attgctgttg gaatgttgat gcctgatatg 1560
agaccgttat ttgtattagg gaaatgggta tctattttta atgctgtggg gattttgatt 1620
gtttctatta ttcaaaccaa aagcttgtca ggtattggag caggattgtt taatctatat 1680
aacatttcat cttatatagg tgatttagtt agtttcactc gattgatggc attaggatta 1740
tctggagcaa gtatagcatc agctttcaat ttaattgttg gtttgtttcc gggaatattg 1800
gctaaactga caattggatt agtattattc attcttttac atgcgatcaa tatttttcta 1860
tcgttactat caggatatgt tcatggagca cgtctgatat ttgttgaatt ttttggtaag 1920
ttttatgagg gtggaggaaa accatttcaa cctttgaagg cttctgagaa atatattaag 1980
gttattacaa agaattaa 1998
<210> 109

<211> 915 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 109 



gatcaaaaat gtgggagtgt tgaaatgaag attataggta tcgatattgg cggaacaaca 60
attaaggcag atttatacga tgagtttgga acgagtttga atcatttcaa agagatagaa 120
acaattattg actatgattt gggaacgaat cagatattaa atcaggtctg tgatttaatt 180
ggtgagtata ctttaaatca ttcaattgat ggtgttggga tttccactgc tggagttgtt 240
aatgctaata ctggagaaat catctatgca ggctatacaa taccagggta tatcggagta 300
aactttactg ccgaaataga aaaacgtttt gggttgtata cttttgttga aaatgatgtt 360
aattgtgctg cattaggtga attgtggaag ggacaagcca aagataagaa aaatgtagta 420
atggttacta ttggaacagg tataggaggc agtattattg tcaacggaca aattgttaac 480
ggatttaact atactgctgg tgaagtaggt tatattcctg taggtaattc ggattggcaa 540
agtaaagcct caacaaccgc attgattcat ttatatcaaa aaaagagctt gaaaactaat 600
caaactggac gtactttctt cactgattta agatctggag ataaagttgc tgaagaaact 660
tttgaaattt ttgtagaaaa tctaacaaaa ggtttattaa cgatttctta tctacttaat 720
ccagaaattc tcatattagg aggtgggatt ctggatagta aggatatttt gttacctgaa 780
attcaaagtt ctttagctaa aaatgcaatg gataataggt ttttacctaa aaatcttgtg 840
gcagctacat taggaaatga agctggtcgt ataggagctg taaaaaattt cttagataga 900
atttctaata aatag 915



<210> 110 
<211> 930 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 110 

aggagaaatc tgatgaaaga tttaactaaa tacaaaggcg ttatccctgc attttatgct 60 
tgctatgatg aaaatggtga aattagccaa gatcgtgtaa aatctctggt acaatatttc 120
attgacaaag gtgtaaaagg tatctatgta aatggttctt caggtgaatg tatttaccaa 180 
agtgtagaag atcgtaaaca aattattgaa gctgttatgg aagttgctaa aggtaaatta 240 
acagttatca accatattgc atgtaataac acgaaagata gtatcgaatt ggcaaaacat 300 
tcagaaagtg ttggagtcga tgctattgca gctatcccac ctatttattt caaattgcca 360
gagtattcaa tcgcagcata ttggaatgca atgagtgaag ctgcgtcaaa tacagatttt 420
attatctata atattccaca attggcaggg gttgcgttga ctggtagttt gtatgcaaca 480
atgcgtcaaa atcctcgtgt gattggagtt aaaaattctt ctatgcctgt acaagatatt 540
caaatgtttg tagctgcagg tggagaagat tacattgtat tcaatggtcc agatgaacaa 600
tatcttggtg gtcgcttgat gggagcagaa gctggtattg gtggtactta tggcgttatg 660
ccagatttgt tcttgaaatt ggaaagtttg attcaagaac gagatttaga tacagctaaa 720
aaacttcaat atgctatcaa tgaagttatc tataagatga tatcaggtaa ggcaaatatg 780
tatgctgtag caaaagaagt tttgcgtcta aatgaaaaac ttgatttagg ttctgttcgt 840
caacctttag aagcattggc agaaggtgac ttggaagttg caaaacaagc agcagaactt 900
attcaacaag cacgaaaaga atttttataa 930



<210> 111 

<211> 759 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 111 



gtgattggag gcaagaatat ggataaagat tatatattaa aagtgaaagg gctgtatcat 60
caatttttac taggaaataa taaaacgttg caagtgctga aaaatgtttc tctttctgct 120
tcgagaggag aatttataag tattctagga attagtggtt ctggaaagtc aactttatta 180
aaatgtattt ctagtttgct tgaaccaaca agtggggaag taattttaaa tggaatcaac 240
ccctataaaa tcagaaatgc aaaattgtca agtataagac gtaacgaagt atcttttata 300
tttcaagcat acaatttaat accttccctg ccggtaatag aaaatatagc acttcctttg 360
cgattatcac aaaaaaaatt aactattaaa aatgtagaaa acttactcaa aagaatgaag 420
tttaatgctg gcttaaacga ttttgttgga actctgtctg gtggagaaca acagaaggtt 480
gctatagcta gagcggttat tgctgatagt gatataatat ttgctgatga gccaactggg 540
gctttagaca gcgtttctcg tgaagtaatt tttgaattat tgagagagtt agtaggggcg 600
ggtaagtgtg taattatggt aacgcacgat atagaattgg cctcgaaaac tgatcgtgca 660
ttaatattga aagatggaaa aattttcaaa gaacttcata gacctagcgg ggaagagttg 720
tataaaatct tagaggtaca atcaactacg gaggaatag 759
<210> 112

<211> 1611 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 112 



cttatgaatt 


atttaaaatt 




tattaatggg gattttaata 




tttctatctt 






ttgtatatgc 


aggtagttta 










ctataagtat 


tctgggttgg 










aagcatcaat 


aactaaagat 










atctttcttc 


tgaaatggtt 






catctattat 




tgagattaat 


cgaagaaaac 




tattttagac 


aaatttttga 


aataatttct tcaatattgt 


tgttcatcgt 


ctctctaagt 




tttatgcttt 


atttaaactt 




ttgtattatc 


agcattgccg 






ctgtctttat 




cagctaatga gtactctaac 










atggttttaa gacattaaaa 




tcttattctg






aaaagttgga 


taaactagaa 






ttaatttgaa 




aattggttgc 


agtactaatt 




tcaggttttt 






attttgtaat 


ttatcataaa 










acgataaagt 


tttagggccg 










ccaaagattt 


aaggaaaccg 




tttttaaaat 


acttaagtgg 


agagaagaat tttatagacg 


ctgaacatga 


taataacgga 


960 


ctgtatactt 


catcaataga 


tgagatacac atgaaagatg 


ttgtatattc 


tattacacca 


1020 


gaaaataaat 


taagtattga 


cttctcattt aagtcaccat 


ttagggtatt attaacagga 


1080 


acttctggta 


gtgggaaaac 


aacgatttta aatttaatta 


atggttcttt 


aaagccacaa 


1140 


aaaggttatg 


taaatttgtt 


atcacatggg aaaaagagtt 


cagattcaat 


accaacagtt 


1200 


gatcagacac 


catatatttt 


tgacactact attcgtgaga 


acgtaacttt atttcaaaat 


1260 


gaatattttt 


cagatgatca 


gataattgag gtgttaaaaa 


aggtaaatct 


atatgaagaa 


1320 


ttagaaaaga 


tagatatact 


aaattatcaa tgtggtgaaa 


atggtagtaa 


tttgtctgga 


1380 


ggtcaaaaac 


aaaaaatagc 


tttagctaga gctctgatta 


gaaataataa agtgtactta 


1440 
tttgacgaaa tatcagctaa tttagataat gataattcaa attccataca tgatattctg 1500
ttcaatttag gtatttcatt tattgaagtt tcacatcatt atgacttaaa tgacaagaga 1560 
tacactgata tatataaatt ggaaaatgga acacttttca aaattaagtg a 1611 

<210> 113 

<211> 1953 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 113 



ggggtcatac 


ttgtgaatca 


gagtaatgaa 


atatttttaa 


atacaataca 










ttcatcattt 






ttctttgata 




tttaggagag 


ggcgatattg 






























aattgtcttt 




aatatacttt 






agtatcttac 


aggatgttag 


gatttcatct 




atatagtttt 
















gttatttatc 














tatgtatact 




cgctccaaag 


agtacggciag 


gccaagtaag 


ttattctata 


aatcttttga 






gaagaggaac tgtttgaaga aaacgaatgc tcttttcgat taaaaatagt ccacattgat 600
agcaataatt gttttgtaaa gtcagtagac tttcagaaag gcaggatttt tctatacagt 660
ttcgaccgga cgggttttgt tagacactct tacacagaaa cggtagcccc aacacctaga 720
gatatcgctc tatttagtac agaaaaagca cagtactttc ttgggctctc ttcaactgaa 780
gaaaagaaag atcaactaat tctaagggaa acttcttcgg gagacagagt ttctattact 840
ataccatatc aagatagggc tagaagagtt cactgtatag gacgttatat ccttttagac 900
tgttcaaacg cagcgaattc cgtcttttat cttgctacat ttaaaaataa tagtttgagg 960
gagttgagtg ttaataagat aacatttgat gaaaagacaa cactttatga aaattcattt 1020
tctgatcgtg ttcttctgtt tttagagagg agaacttttt acgaaaaaat acttagtttt 1080
gatttaatag aaaatacatt aaaaactgaa tttgaaagac ccatcataaa atcgaataat 1140
actaaatatt tttctaaggt gatttggact aaaaaagaca gttatgatgt aagcataccg 1200
atttctttat tttggaaaag tgaagatacg gatgagctcc caagaagaaa aaaatgtatt 1260
ctaagtgtct atggggctta tggtaagaat gataattctg atttagatga aattatgtta 1320
tctataatag atgcaggttt tatttatgct atagttcatg taagaggtgg aggctatttg 1380
ggtggtgaat ggtatcgctc aggaaaggcg ttaaacaaat ggaattctat cagagatttb 1440
attgagggag taaattattt aagggaaaat gatgtaattg acagtaagcg attaggttta 1500
ataacttcta gtgcaggtgg aataattgct ggtgcggtgt tgaacgagga aaaaaattta 1560
ttgcaaagca ttctcttatt ttcgccattt ataaatcctt atgatacact tcagaatcca 1620
aatgatcctc tttctaagac cgaaatagca gagtggggag atattagaga ccctgaagta 1680
aaggcttata taaagtcata ttcgcctatg caaaatattg aaaaggcacg agactcaaat 1740
actgttatag ttaatatttt gggtgagaaa gacccatata ttaataataa cgaagtgata 1800
gagtgghcaa agaaattaaa ttctattgga gttaaaagtt tgttgtattt aaacaaagca 1860
gctggtcatg gaggttttac tccatcagat gttctcttaa tgattgatac tttaaattat 1920
ttctttgaag aagtgggaag gaataactta tga 1953



<210> 114 

<211> 1449 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 114 



aacgggggtg 


accaagtgat 


tgatgggaaa 


cgattattat ttagtttgac catagtcagb 


60 


tatgccttga 


cgctagtaag 


tggaattgtg 


tatctgttta ataataataa tgttagctta 


120 


ctttctactt 


tattgttctt 


gttggttagt 


agcttaattg cttgttggaa tgatatcaag 


180 


tattacttaa 


tccattttat 


tttctattta 


accatttttg tatttctggt atcaagaccg 


240 


accattgatt 


attttaggga 


tggtgctttg 


gatacctatc atccaatagc ctatcgtttt 


300 


gcctttatag 


ttgtcatgat 


ttcgattctg 


ggcttgacca caggaggcat tctggctcgt 


360 


tacttcatag 


ctaggaagaa 


aataaaagta 


gcaaatatag gaaattctct aaaagaggtt 


420 


tatatcaagc 


ggttacgctt 


tgtatcacta 


ggagtttttc ttctaactta tcctttctat 


480 


ttcattcggt 


tatttgaacg 


gctcttgtat 


cgtttgcaga cttcctacta tgcctactat 


540 


gcaaattttg 


aaagtaaact 


gccttatttt 


acctacattt tgtctacctt tacggtctat 


600 


gcaatgtgta 


tgtatctggc 


aaccaagcca 


aagaaattgc aggccacagc agtgcttgtc 


660 


tcctttattg 


cagctaatac 


tattcatttg 


gcaattggga cacgaaatcc ctttatttta 


720 
agtattttat


ttgcttttgt 


ttattacttt atgcgggagc aaactgaaaa aggaaaatgg 


780 


attgggttta 


aagaaaagtt 


agcgattttt gtaggttctc ctattctcat gttagcgatg 


840 


ggagtactca 


attatgtacg 


ggataatgtc caagtttccc atacaggttt ctgggatatc 


900 


ttacttgact 


ttatctataa 


acaagggact agttttggtg ttctggctcg aggttttcta 


960 


tttaacagta 


gcctccctta 


ccgagatttc cgtaatttta cttttggtcc tgttcttgat 


1020 


tattttgcaa 


gggggagttt 


gggagccatt ttcggaggaa aagcctttga acatacaacc 


1080 


aatagtgtgg 


aactagctat 


tgatagtaat agttatgccc acaatctatc ctatcttgtc 


1140 


ttgaacaagg 


aatacttgaa 


agggcatggt atcggaagta gttatatcat ggagttgtat 


1200 


accgactatg 


gtatgattgg 


agtctttctg cttagtttct tactcggcgt attatttata 


1260 


gccatgctgc 


aagtagccta 


tcgctcaagg acaatcctat ttgctttatc cctactcatc 


1320 


ttgaataatc 


tattctttat 


gccaagaagc agcttttcag aaagtttctt caatttattt 


1380 


acaatgcaat 


tctggggaat 


tgttcttgtg attatatttg tagcaaaaat gcttacaaag 


1440 


gaaaactag 






1449 



<210> 115 

<211> 831 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 115 



tcaatgcaat 


cgaggaggaa 


agagatgaag 


aaaacaagct 


ctaaactctt 


tgtagtaccc 


60 


tacatgcttt 


ggattgcgct 


ctttgtattg 


gcacccttgg 


tcttgatttt 


cggtcaatcc 


120 


tttttcaaca 


tcgaaggcca 


gttcagttta 


gaaaattaca 


aatcttactt 


tgcgtcacaa 


180 


aacttgacct 


atcttaaaat 


gagtttcaac 


tcagtgcttt 


atgcaggcat 


tgtgaccttt 


240 


gtggcactgc 


ttatcagtta 


tccgacggcc 


ctctttttga cccgtctcaa acaccgtcaa 


300 


ctctggctca 


tgctgattat 


ccttcctacc 


tggatcaatt 


tgctccttaa 


agcctatgct 


360 


tttatcggga 


tttttggtca 


aaatggctct 


attaaccaat 


tcttggaatt 


tatcggaatt 


420 


ggttcacaac 


agttgctttt 


taccgatttt 


tcctttatct 


ttgtcgcaag 


ctacatcgag 


480 


ctccccttta 


tgattttgcc 


gattttcaat 


gtcttagacg atatggataa 


taatctcatc 


540 


aatgctagtt 


atgaccttgg 


tgcaactaag 


tgggagacct 


tccgtcatgt 


catcttccct 


600 


ctatctatga 


acggtgtgcg 


aagtggggtt 


cagtcggtct 


ttatcccaag 


tttgagtctc 


660 
ttcahgctga cccgtttgat


tggtgggaac 


cgcgttatca 


ccttggggac 


ggctattgag 


720 


cagaattttc 


taaccaatga caactatggt 


atgggttcaa 


ccatcggtgt 


gattctcatc 


780 


ctgaccatgt 


tcatcaccat 


gtgggtgact 










<210> 116 
<211> 771 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 116 
aaaggaaacc 


gtatgacaga 


tgcgatttta 


caggtatcag 


acctgtccgt 


ttattataat 


60 


aaaaagaagg 


ctttgaatag 


tgtttcccta 


tctttccaac 


ctaaggaaat 


tacagccttg 


120


attggtccat 


ctggatcagg 


gaagtcaacc 


ctcctcaagt 


ctctcaaccg 


catgggagat 


180 


ctcaatccag 


aggtgaccac 


aactggatcc 


gtggtgtaca 


atggtcacaa 


catctacagt 


240 


ccgcgtacag 


atacggttga 


attacgtaag 


gaaatcggaa 


tggttttcca 


acaacctaat 


300 


cctttcccta 


tgactatcta 


tgagaatgtt 


gtctacgggc 


ttcgtatcaa 


tggaattaag 


360 


gataagcagg 


ttctggatga 


agccgtagaa 


aaagccttgc 


aaggtgcctc 


tatctgggat 


420 


gaggtcaagg 


atcgtctata 


tgattcagct 


attggattgt 


caggtggtca 


acagcagcgt 


480 


gtctgcgtgg 


cccgtgtctt 


ggcaactagt 


cctaaaatca 


tcctcttgga 


tgagccaact 


540 


tcggctttgg 


atccgatttc 


agctggtaaa 


attgaggaaa 


ccttgtatgg 


tctaaaagac 


600 


aagtacacca 


tgcttctggt 


aacccgttcc 


atgcagcaag 


cttcacgtat 


ctctgataag 


660 


acaggatttt 


tcctagatgg 


agatttgatt 


gaatttaatg 


ataccaagca 


gatgttcctt 


720 


gatccccaac 


acaaggaaac 


ggaagactat 










<210> 117 
<211> 912 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 117 
ttacgaaaga 


aagaggaaag 


aaaaattatg 


cgcgctaaga 


aattagataa 


acttgcaaca 


60 


gctgtcctct 


atacgattgc 


tagcatcatt 


gtgacaatct 


tggcttcctt 


gattctctat 


120 


atcttggttc 


ggggcttgcc 


ccatatctct 


tggtctttct 


tgactggaag 


gtcttctgct 


180 


tttcaagcag gtggtgggat 


tggcattcag 


ctttacaatt 


cctttttcct 


attggtcatt 


240 
accttgatta


tttctgtacc 


tctttctatg 


ggagctggga 


tttacttggc 


tgaatatgct 


300 


aaaaaaggtc 


ctgttaccaa 


ctttgtgcgg 


acttgtattg 


aaattttgtc 


ctctttacca 


360 


tcagtggtgg 


tgggtctctt 


tggttacttg 


atctttgtag 


tccagtttga 


gtatggattt 


420 


tcaatcattt 


caggtgcctt 


ggccttgaca


gtctttaact 


tgcctcagat 


gacgcgtaat 


480 


gtagaggata 


gtttgaaaca 


cgttcaccat 


acccaacgtg 


aggctggtct 


ggctcttggg 


540 


atttctcgct




ggttcatgtt








600 
















gcagggcaat 


cggcgccagc 


tcttgactgg 


tctaactgga 


atatcctcag 


tgtgactagc 


720 


cccatctcta 


tcttccgtca 


agcagaaacc 


ttggctgtcc 


atatctggaa 


agtcaatagt 


780 


gaaggcacta 


ttccagatgg 


aaccattgta 


tcagcaggtt 


ctgccgctgt 


gctcctgatc 


840 


tttatcctga 


tttttaactt


tggagctcgt 


aagttcggaa 


gctatctaca 


caagaaatta 


900 


accgctgcct 


aa 










912 



<210> 118 

<211> 1800 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 118 



gcgaatttat 


atctaaaagg 


gatattaaag 


aaaggagata 


tgcttatgaa 


gatttacaaa 


60 


aaactatttg 


cttatgtcca 


agataagaaa 


tatcttgggg 


ttttggccat 


aattttttct 


120 


gctatatctg 


ctgcacttac 


agtatatgga 


tattatttaa 


tctacaaatt 


tctagataag 


180 


ttaataatta 


attcaaactt 


atccggtgca 


gagagtatag 


cattaaaatc 


tgttattaca 


240 


ctaacaagtg 


gagcgatatt 


ttattttgtc 


tcaggaatgt 


tttcacatat 


cttgggattc 


300 


aggcttgaaa 


caaatttaag 


aaaaagggga 


atcgatggtc 


tggaaaaagc 


aagttttagg 


360 


ttctttgact 


taaatccatc 


tggtcaaata 


agaaagatta 


tagatgacaa 


tgctgcacaa 


420 


actcatcagg 


tggtagcaca 


catgattccc 


gatagttctc 


aggcaataat 


cacacccgta 


480 


cttgtacttg 


cacttggctt 


tatagtaagt 


ataagagttg 


gcataatttt 


gcttgctctt 


540 


actataattg 


gtggcttaat 


tttaggggca 


atgatgggcg 


agcaagaatt 


tatgaagata 


600 


taccaagaat 


ccctatctaa 


actaagtgct 


gaaactgttg 


agtacgtgag 


aggaatgcaa 


660 


gttgtaaaaa 


tatttaaagc 


aaatgtagag 


tcttttaaaa 


gcttttataa 


ggcgataaaa 


720 
gattactcaa


agtatgctta 


tgattattcc 


ctatcttgta 


aaaggcctta 


tgttttgtat 


780 


caatggttat 


tttttggact 


gattgcaatt 


ttaattattc 


ctatagttta 


ttttatgact 


840 


agcttagcta 


gcgcaaaggt 


gattttactt 


gagcttatca 


tgattttatt 


tttatcagga 


900 


gttctctttg 


tttcattcat 


gagaatgatg 


tggtactcca 


tgtatatttc 


tcaaggaaat 


960 


tatgcagtag 


atactttaga 


ggcgctttac 


gaagatatgc 


aaaaagacaa 


attagtgcat 


1020 


ggtaatgtca


ataattttaa 


aaactataat 


atagaatttg 


agaatgttag


ctttgcttat 


1080 


aatgataaag 


ctgtcattga 


aaatttatcc 


tttaatttag 


aagaaggaaa 


gtcctacgca 


1140 


cttgtcggtt 


catctggatc 


aggcaaatca 


acagtagcaa 


aacttatatc 


aggtttttac 


1200 


aatgttaata 


aaggaagcat 


aaagataggc 


gggatagcaa 


taagtgaata 


ttctgacgaa 


1260 


gccttaatta 


aagccatttc 


ctttgttttt 


caagattcaa 


aattattcaa 


gaagagcatt 


1320 


tatgataatg 


tagcgttagc 


taataaagat 


gcgacgaaag 


atgacgttat 


gagagcctta 


1380 


aaattagcag 


gatgcgattt 


aatattagac 


aaattcccag 


aaagagaaaa 


tacaatcata 


1440 


ggctcaaaag 


gtgtttattt 


atccggtgga 


gaaaaacaaa 


gaattgcaat 


tgctagagca 


1500 


attttaaagg 


attccaaaat 


tattattatg 


gatgaagcat 


cagcatctat 


tgacccagat 


1560 


aacgagtttg 


aattgcaaaa 


agcttttaaa 


aatcttatga 


aggataaaac 


agttatcatg 


1620 


attgcacaca 


ggctatctac 


aattaaagac 


cttgatgaaa 


ttattgtcat 


ggatagtgga 


1680 


aaaattatag 


aaagagggtc 


tgacaaagaa 


ttaatgtcaa 


aagatacaag 


gtataagagc 


1740 


ctgcaagaga 


tgtttaacag 


tgcgaatgaa 


tggagggttt 


caaatgaaag 


agttttataa 


1800 



<210> 119 

<211> 1791 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 119 



tgtcaaaaga tacaaggtat 


aagagcctgc 


aagagatgtt 


taacagtgcg 


aatgaatgga 


60 


gggtttcaaa tgaaagagtt 


ttataaaaaa 


agatttgctc 


ttacagatgg 


aggagcaaga 


120 


aatttaagta aagcaacact 


ggcttcattt 


ttcgtttatt 


gtataaacat 


gcttcctgcc 


180 


atattactta tgatttttgc 


tcaggaagtt 


ttggaaaata 


tgggcaaaag 


caatggcttt 


240 


tatatagtat tctcagtttt 


gattttgata 


gcaatgtata 


ttttgctttc 


tatcgaatac 


300 


gataaattat ataacacaac 


ctatcaagaa 


agtgcagatt 


taagaataag 


gacagcggag 


360 






aatttatcaa


aattacctct 


atcttacttt 


tctaaacatg 


acatttccga 


catttcacaa 


420 


acaatcatgg 


ctgatattga 


aggcatagag 


catgcaatga 


gccactcaat 


accaaaggtg 


480 


ggcggcatgg 


tactgttttt 


cccattaata 




tgctagcggg 


caatgtcaag 


540 






tccatctatt


ttaagcttta


tatttatacc


tttatctaaa


600 














660 










ataatttatc 


















780 








atatttagct 


ttatatctct 


tgctgttgtg 








aattattaat 










tatttactag 








catctaaaga


gggcttgatg 


960 


gaaatatttt 








aaattcaaaa 


tcaagattta 


1020 


caagaaggcg 


atgactatag 


cttaaaaaaa 


tttgatattg 


atctaaaaga 


tgttgagttt 


1080 






agttttaaat


ggtgtaagtt 


ttaaagctaa 


gcagggagag 


1140 










ctatcttgaa 


acttatatca 


1200 














1260 








attgttttcc 




tctctttaat 


1320 




























1440 
















gccagagcct 


tcttaaaaga 


tgcgccgata 


ttgatcttag 


atgagataac 


agcaagcctt 


1560 


gatgttaaca 


acgagaaaaa 


gattcaagag 


tctttaaata 


atttagttaa 


agataaaact 


1620 


gttgtaatca 


tttcacatag 


aatgaaatcc 


atagaaaatg 


cagacaagat 


agtagttctt 


1680 


caaaacggaa 


gagtagaaag 


cgaaggtaag 


catgaagagc 


ttttacaaaa 


atcaaaaatt 


1740 


tacaaaaatt 


taatagaaaa 


gacaaaaatg 


gcagaagaat 


ttatttatta 


g 


1791 



<210> 120 

<211> 675 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 120 
gaatttattt




ctacaatgga 




60 




aagatttagt 


aagcatcggt 


gtttttggcg


taatttattt


tgccttcatg


120 






cttgattcca 


atattgttct 




gacagtatta 


180 






tgttatgtta 










ctatttatat 












300 


gttgtggttt 


tatcacttat 












tataattcat




tatgctttct 










tctttaatgc 


aaatgctttt 


agcaaaagaa 


aaatatatgg 


agtggtcttt 


gatgactatg 


480 


ggaaaagatt 


atgttgatgt 


attagaaaag 


ttaataactt 


atcctcacat 


ggctttagta 


540 


gccttaggtg 


ctttcttagg 


aggaattctt 


ggagcatata 


taggcaaggc 


tctattgaaa 


600 


aaacactttt 


caaatggatt 


atattgtgtg 










tgctatctga 


attaa 










675 


<210> 121 
<211> 636 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 121 
tgtattagaa 


aagttaataa 


cttatcctca 






gtgctttctt 




aggaggaatt 


cttggagcat 


atataggcaa 










attatattgt 


gtgggatact 


ttactccttg 


cctaatttta 








cctatagtta 


agatgttttt 


gagtatacct 


attgttatta 


gaatgtttat 


tttaccattt 




atggcagcaa 


gctttatgat 


aaagacctcg 










aagcttaaga 


tttcaaagaa 


tgtatccata 




ttatgtttag






tcttttaagg 


aggagaagaa 


aaacatcaaa 


atggctatga 


gagtaagagg 


gataaatttt 


420 


aaaaacccag 


tcaaatatct 


tgaatatgtt 


tctgtgccac 


tactcattat 


atcatctaat 


480 


atatcagatg 


acattgcaaa 


agcggcagaa 


acaaaggcaa 


tagaaaatcc 


aattgccaag 


540 


accagataca 


ttcgcgtaaa 


gatacagcta 


attgattttg 


tttatgtttt 


agcggttgct 


600 


ggacttattg 


tgggaggctt 


aatatggttg 


aaataa 






636 
<210> 122

<211> 1173

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 122 



ttttgtttat 


gttttagcgg 


ttgctggact 


tattgtggga ggcttaatat ggttgaaata 


60 


aaaaatttaa 


gtcttgatta 


tggtgaagag 


catatattag atgatatatc actatccata 


120 


gccgagggag 


agtgcgtgct 


atttacagga 


aaaagtggaa atggtaagtc atctttaata 


180 


aattcaatca 


atggactagc 


tgtaaggtat 


gataacgcaa agacaaaggg cgaaataatt 


240 


attgatggta 


agaatataaa 


aaatttggaa 


ctttatcaaa tctcaatgct tgtttcaact 


300 


gtttttcaaa 


atcctaagac 


atattttttt 


aatgtcaata cgacattaga attattattt 


360 


tatttggaaa 


atatcggtct 


tgcaagagaa 


gagatggaca ggcgtttgaa ggatatactt 


420 


gagatattcc 


cgataaaaaa 


tcttttgaac 


agaaatatat ttaatctatc cggcggtgaa 


480 


aaacaaattc 


tttgcattgc 


agcttcttat 


atagcaggta caaagattat agttatggat 


540 


gagccttcat 


cgaatttaga 


tattaaaagc 


ataagtgttt tggcaaagat gctaaagata 


600 


ttaaaagaga 


aaggcataag 


cataattgtt 


gcagagcata gaatttatta tttgatggac 


660 


atagttgacc 


gtgtattttt 


aatagataaa 


ggaaagctta aaaaaactta tactagaagt 


720 


gaatttttaa 


agctagataa 


aaatgaatta 


aatgctttaa gtttaagaga taaagaatta 


780 


agtaaattaa 


aagttcctta 


tttaaaagaa 


ggtggagagt atcagataaa aaatcttagt 


840 


tacaaattta 


ctgatgatga 


gtgtttaagc 


ttaaaagata tttcgttcaa gcttgggaaa 


900 


atttatggca 


taataggatc 


caacggacga 


ggaaaatcaa cgcttttaag atgtttaata 


960 


ggtcttgaga 


aaaaatcaaa 


agaagaaatt 


tattttaagg gagagaagct atctaaaaaa 


1020 


gaaagactca 


aaaactcttc 


acttgttatg 


caagatgtaa atcatcaatt attcacagat 


1080 


gaagtattca 


acgagcttag 


attaggagta 


aagaattttg atgaagaaaa ggcgaaaatc 


1140 


attttaaacc 


ccaattattc 


accccaaatc 


taa 


1173 



<210> 123 
<211> 276 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 123 

tttgggttaa aagatttatg cctggacgaa tttattgaaa ggcatccgat gagtttatca 60 
ggagggcaaa agcaaaggct tgcaatagca tctgttatgt gcaagaattc tccatttgtc 120 
ttttttgacg aaccttcaag tggtatggat tattccaata tgataaaaat atctgaactg 180
attaataagt ataaaaccat ggataaaata atttttattg tttcccatga tatagaattt 240 
ttaaatgaag tggcagatga aatttttgaa ttgtaa 276

<210> 124 

<211> 975 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 124 



a. a.a gg ac ga g 


agagctcaat 


ggatattaga 


ccgcaatcaa gtgatgaact tgattcgcaa




a ga aga g t a a. 


ggagagacat 


gtcaaatagt 


ttaaaaggga ctttactaac agttgtggct




gg t a. 1 1 gc 1 1 


gggggttgtc 


aggaacgagt 


ggccaatacc taatggcaca cggaatttcg




gctctggtct 


tgactaactt 


gcgtctttta 


atcgctggtg gaattctcat gctcttggct 




tatgctactg


caaaggataa 


aatactggtc 


tttttaaagg atagaaagag tttgctgtct 




cttcttattt 


ttgctctgat 


tggtcttttt 


ctcaaccaat tcgcctatct gtctgctatt 


360 


caggagacca 


atgcgggaac 


agcgacggtg 


cttcagtatg tttgtcctgt cggaatttta 


420 


atttatagct 


gtatcaagga 


tagggtggca 


ccgacactgg gagagatagt ttccatcata 


480 


ttcgccatcg 


gaggaacctt 


cctgatcgca 


acacatgggc agttggacca gttatccatg 


540 


acacctgctg 


gtctgttctg 


gggtctcttt 


tctgccttga cttatgctct gtatatcatt 


600 


ttacccatag 


ccttgattaa 


aaagtggggg 


agcagcttgg tcattggtgt gggaatggtc 


660 


atagcaggtt 


tggtcgccct 


tccttttaca 


ggggttctac aggccgatat cccgactagt 


720 


cttgattttc 


tccttgcgtt 


tgcaggcatt 


atccttatcg ggactgtctt tgcctataca 


780 


gctttcctta 


aaggagccag 


tctgatagga 


ccggtcaagt caagcttgtt ggcttcaatt 


840 


gagccaatat 


cggcgatttt 


ctttgccttc 


ttaataatga atgaacaatt ttatcccatt 


900 


gattttcttg 


gtatggcaat 


gatattgttt 


gctgtaactt tgatttcttt gaaagattta 


960 


ttcttagaaa


aataa 






975 



<210> 125 

<211> 366 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 125 
atatctcaaa acgcattatc


gctgttttgg 


tacctaatat 


tgttgaagaa 


ggcgaaactc 


60 


cacaggaagc ctacgatttg gaagccatta 


tgtacaatcc 


aaaaatcgtc 


tctcactctg 


120 


ttcaagatgc tgctcttggc 


gaaggagaag 


gttgcctgtc 


tgttgaccgt 


aacgtgcctg 


180 


gctatgttgt tcgccatgcc 


cgcgttactg 


ttgactactt 


tgacaaagat 


ggagaaaaac 


240 


accgtatcaa actcaaaggc 


tacaactcca 


ttgttgttca 


gcatgaaatt 


gaccacatta 


300 


acggtatcat gttttacgat 


cgcatcaatg 


aaaaagaccc 


atttgcagtt 


aaagatggtt 


360 


tactga 










366 



<210> 126 
<211> 261 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 126 

gtttactctt taaaaaagat agaaatgaga gaaatcatgc tactgcaact attttcttta 60 
tatttcgaga gtttgatctt gaccaccatc cttgttctga tttttttagg gatttggatt 120 
ggtctgagag ccatgtcggg agttgataag acagccaggg ctcgccaagc ccatctctat 180 
gatatgatta tgattggagt cttggttgtc ccagtattat cctttgcggt tatgagttta 240 
attcttgttt tcaaggcata a 261 

<210> 127 

<211> 579 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 127 



gatgccagta gatggcgaac 


gcttggccta 


tcaaaaatta 


aagaaataat 


gcaaaagaag 


60 


tatgtaaaaa tcctctactc 


ctcaccaatt 


ggtattctat 


cacttgtagc 


tgatgaccat 


120 


tatttgtatg gaatttgggt 


tcaggagcag 


aagcattttg 


agaggggact 


aggagatgaa 


180 


acgatagaag aagttgttag tcatcctatt 


ttagacccag 


ttattgcttg 


cttagatgat 


240 


tactttaaag gcaagcctca 


ggatttatcc 


aacttgctct 


tggcgccaat 


cggaacgaat 


300 


tttgaaaaga gagtttggga 


ctatttacag 


ggcattcctt 


atggtcagac 


agtgacctat 


360 


ggacaaattg ctcaagacct 


gcaagtggct 


tctgctcaag 


caattggtgg 


agcagtggga 


420 


cgcaatcctt ggtctatcct 


agtaccttgt 


catcgtgtgt 


tgggagcagg 


caagcgtctg 


480 


acaggttatg ctgcaggagt 


ggaaaagaaa 


gcttggctct 


tggagcatga 


aggagtagat 


540 
tttaaagata gaagcaatag aaggagaagc acatgttag 579

<210> 128 

<211> 1455 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 128 







ttgtatcatt 




















gctttgattg 






tgcctattag


gcggatttat tgcttttaag 
































































tttaagattg






ctaggaaagg












gatgctagta


aggctaagga acacttgata




























attatccttt 


atccaagtcc 


tgtggttgaa


aatttagaag 


agatagcctt gccagtatct 




gctttctttg 


atgttatcca 


atcttcgtac 


ttactcgaaa 


aagatgcggc cttgtaccaa 


840 


tcttactttg 


ataagaaaca 


tcaaaaagtt 


gtcgctctaa 


cctttgatga tggtccaaat 


900 


ccagcaacga 


ccccgcaggt 


attagagacc 


ctagctaaat 


atgatattaa agcgactttc 


960 


tttgtgcttg 


ggaaaaatgt 


ttctgggaat 


gaggacttgg 


tgaagaggat aaaatctgaa 


1020 


ggtcatgttg 


ttggaaacca 


tagctggagc 


catccgattc 


tctcgcaact ctctcttgat 


1080 


gaagctaaaa 


agcagattac 


tgatactgag 


gatgtgctaa 


ctaaagtgct gggttctagt 


1140 


tctaaactca 


tgcgtccacc 


ttatggtgct 


attacagatg 


atattcgcaa tagcttggat 


1200 


ttgagcttta 


tcatgtggga 


tgtggatagt 


ctggactgga 


agagtaaaaa tgaagcatct 


1260 


attttgacag 


aaattcagta 


tcaagtagct 


aatggctcta 


tcgttttgat gcatgatatt 


1320 


cacagtccga 


cagtcaatgc 


cttgccaagg 


gtcattgagt 


atttgaaaaa tcaaggttat 


1380 
acctttgtga ccataccaga gatgctcaat actcgcctaa aagctcatga gctgtactat 1440
agtcgtgatg aataa 1455 

<210> 129 

<211> 744 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 129 



ctacggtttt 




tggtagaatc tttttacaaa aatacttggt 


aatcttgttt 




attcatgcta 


taataggaac 


aattactttt aggaggtgca gtatgtctta 


tttatttgag 




atattaccga 


gtttactgaa 


tggtgcgagc acgactgtac aggtctttgc 


actggtcttg 


180 


ctattttcga 


ttcccttggg 


cgttttgatt gcctttgcct tgcaagtcca 


ttggaagccc 


240 


ctccattatc 


tgattaacat 


ttacatctgg gttatgcgag gaaccccctt 


actcttgcaa 


300 


ctgattttta 


tctattatgt 


gctcccaagt attgggattc gtttagaccg 


ccttcctgca 


360 


gctattattg 


cctttgttct 


caactatgca gcttactttg cagaaatttt 


ccgtggggga 


420


attgacacta 


ttccaagagg 


acagtatgag gccgccaagg tcttgaagtt 


tagccctttt 


480 


gacagagtgc 


gctatattat 


cttgccccaa gtgaccaaga tcgttcttcc 


tagtgtcttt 


540 


aatgaagtta 


tgagtttggt 


caaggatact tctttggtct atgctctcgg aatttcagac 


600 


cttatcttgg 


ctagtcgaac 


agctgctaac cgcgatgcta gtctagttcc 


tatgttcttg 


660 


gcaggagcca 


tttatttgat 


tttgattggg attgtgacaa ttatttccaa aaaagttgag 


720 


aagaagtata 


gttattatag 


atag 




744 



<210> 130 
<211> 717 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 130 

atggaagaaa gtattaatcc aatcatctct attggtcctg ttatcttcaa tctgactatg 60 
ttagccatga ctttgttgat tgtgggagtt atttttgtct ttatttattg ggcaagccgc 120 
aatatgacct tgaaacccaa aggaaagcaa aatgtacttg agtatgtcta tgactttgtt 180 
attggattta cagaacctaa cattggttcg cgctacatga aagattactc actctttttc 240 
ctttgtttat tccttttcat ggtgattgcc aataaccttg gcttaatgac aaagcttcaa 300 
acgatcgatg ggactaactg gtggagttcg


ccaaccgcta 


atttacagta 


tgacttaacc


360 


ttatcttttc ttgtcatttt gttgacacat 




ttcgtcgtcg 


tggatttaaa 


420 


aaaagtataa aatcttttat gagtcctgtt 


tttgtcatac 


cgatgaatat 


cttggaagaa 


480 


tttacaaact tcttatcttt ggctttgcgg 


atttttggga 


atatctttgc 


aggagaggtc 


540 


atgacgagtt tgttacttct tctttcccac 


caagctattt 


attggtatcc 


agtagccttt 


600 


ggagctaatt tggcttggac tgcattttct 


gtctttattt 


cctgcatcca 


agcttatgtt 


660 


tttactcttt tgacatctgt gtatttaggg 


aataagatta 


atattgaaga 


ggaatag 


717 



<210> 131 

<211> 1695 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 131 



gatataatat 


tatggattat 


caacaaggag 


gaaaaacttt 


tgagtgaaaa 


gtcaagagaa 


60 


gaagagaaat 


taagctttaa 


agagcagatt 


ctgagagatt 


tagaaaaagt 


aaaaggctat 


120 


gatgaagttc 


tgaaagaaga 


tgaggcagta 


gttcgcactc 


ctgcaaatga 


accttcaact 


180 


gaagaactca 


tggctgattc 


cttgtcaacg 


gtagaggaga 


ttatgagaaa 


agctcctacc 


240 


gtgcctactc 


acccaagtca 


aggtgtacca 


gcttctccag 


cagatgagat 


tcaaagagaa 


300 


actcctggtg 


ttccaagtca 


tccaagtcaa 


gatgtacctt 


cttctccagc 


ggaagaaagt 


360 


ggatcaagac 


caggtccagg 


tcctgttaga 


cctaagaaac 


ttgaaagaga 


atacaatgaa 


420 


accccaacaa 


gggtagctgt 


ttcctatacg 


acggcagaga 


aaaaagcaga 


acaagcaggt 


480 


ccagaaacac 


ctacgcctgc 


tacagaaaca 


gtggatatca 


tcagagatac 


atcacgtcgt 


540 


agccgtagag 


aaggagcaaa 


acccgttaag 


cctaagaaag 


agaagaagtc 


acatgtgaaa 


600 


gcttttgtga 


tttcattcct 


tgtattcctt 


gccttgctct 


cagcaggtgg 


ttactttggt 


660 


taccagtacg 


tgctagattc 


cttattacct 


atcgatgcta 


attctaagaa 


atatgtgacg 


720 


gttggaattc 


cagaaggttc 


aaacgttcaa 


gaaatcggta 


cgacgcttga 


aaaagctggt 


780 


ttggtaaagc 


atggtctgat 


ttttagtttt 


tatgccaagt 


ataaaaatta 


taccgacttg 


840 


aaagcaggtt 


actacaattt 


gcaaaagagt 


atgagtacag 


aagacttact 


caaagagttg 


900 


caaaaaggtg 


gaacagatga 


accgcaagaa 


cctgtacttg 


cgactttgac 


aattccagaa 


960 


ggttatacct 


tggatcagat 


tgctcaagct 


gtgggtcaat 


tgcaaggtga 


cttcaaagag 


1020 
tctttgacag


cggaggcttt


cttggctaaa gttcaagatg 


agacgtttat cagtcaagca 


1080 


gtagcgaaat 


atcctacttt 


actggaaagt ttgcctgtaa 


aagacagcgg tgcgcgttat 


1140 


cgtttggaag


gatacctttt 


cccagctaca tactctatca 


aggaaagcac aactattgag


1200 








tatctcctta ctatagtact


1260 
















tcgtaagctc attgcaggtg 


tattctacaa tcgtttgaat 


1380 










1440 


aatatcagtc 


tagctgagga 


tgttgcgatt gataccaaca 


ttgattcacc ttataatgtt 


1500 


tataaaaatg 


taggtctcat 


gcctggtcca gtcgatagtc 


caagtctgga tgcgattgag 


1560 


tcaagcatca 


atcaaactaa 


gagcgataac ctctactttg 


tagcagatgt cacagaaggc 


1620 


aaggtctact 


atgctaacaa 


tcaagaagac cacgaccgca 


atgtcgctga acatgtcaac 


1680 


agcaaattaa 


actaa 






1695 



<210> 132 

<211> 879 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 132 



tcgtcgttta 


caggaggaaa 


tttaacaggg caattgactg aaaagattca agaacatgaa 


60 


ttaattaaga 


ctaaccaagc 


agagaaaagt gtacaggatg 


ttttggataa ttgtattgaa 


120 


agggtacaaa 


acaattcact 


gaaatcagat agggttactt 


cttttgagac cccgtttgct 


180 


ctcttattta 


tctttgcgac 


tatagctgtg atgctaacct 


atgggggtta tcgggtcagc 


240 


gcaggatata 


tatctgtggg 


aaccttggtt tcgtttttga 


tttacctctt tcaattactt 


300 


aatcctatta 


gtaatatagc 


taattttgta actgtttatt 


ctaggagcaa gggatcttca 


360 


gttgcactgg 


agaacttgct 


tgcagttcct aaagaaaaat 


ttgagggagg aaaatcggta 


420 


tcaggacgag 


ggttgaattt 


taaccatgtc tattttggtt 


atgatgaaaa tcgacctgtc 


480 


ttaaaggata 


ttacttgttc 


aattttcaag gggcaaaaaa 


ttgcttttgt tggaccatct 


540 


ggatcaggaa 


aatcaacgat 


tgtgcgtttg ttagagcggt 


tttataaacc gctttcagga 


600 


gatattctaa 


tggagcaatc 


aagtatatat gattttaact 


taaaagaatg gagaagtaaa 


660 


atcgcttggg 


tttcacaaaa 


taatgcagtc ttatctggca 


gtattcgtga caatctttgt 


720 
ctcggtttga atcgcttagt aactgatgat gaattgatga aagtgctaga cttagtatca 780
ctaggtgatg agattcgctc catgaaagag ggactagata ctgaagttgg tgaacgcgga 840 
cgactcttgt caggggggcg aacgaaagac ttcaaatag 879 



<210> 133 

<211> 555 

<212> DNA 

<213> Streptococcus pneumoniae 



<400> 133 



ggagtattta tgaaattaaa attattaaga 


gtagatacta 


aggtgattat 


ggggagtttc 


60 


ttacttgttc tgtctagtct acttgctttg 


ttgcttcccc 


ttatcttaaa 


ggatttaata 


120 


gatgggagtt ctattgaaaa tataggctcc 


aaagtatttc 


aatcgttttt 


gatttttatt 


180 


ggtcaagcct tgttttcttc tattggttac 


tatctgttta 


gtcaatcggg 


tgaaaaaaag 


240 


atagcaaaaa tcaggaaaaa agtgatagag 


gggttgattt 


atgtagagaa 


atccttcttt 


300 


gataagagcc aaagtgggga gttgacttct 


gccattgtca 


atgacacgag 


tgtcattcgt 


360 


gagtttttaa ttacgacttt cccaaatatt 


attctgagtt 


tagttatggt 


acttggttcc 


420 


attgtagtct tatttagtct tgattggaat 


ctttctctac 


ttttattcat 


cactcttcct 


480 


tgtatgatgt ttattatctt gcccctttcc 


aatatcagtg 


aaaagtatag 


tcgtcgttta 


540 


caggaggaaa tttaa 








555 



<210> 134 

<211> 1989 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 134 



aacaaatatt taaagcagga ggttccggaa 


atgaaaaagt 


ctaagagcaa 


atatctaacc 


60 


ttggcaggtc ttgtcctggg tacaggagtt 


ttattgagcg 


cgtgtggaaa 


ttctagcacg 


120 


gcgtcaaaaa cctacaacta tgtttattca 


agtgatccat 


ctagcttgaa 


ctatctagca 


180 


gaaaaccgcg cagcaacatc cgatattgtt 


gcaaatttgg 


tagacgggtt 


attagaaaat 


240 


gaccaatatg ggaatattat tccatcatta 


gcagaggatt 


ggactgtttc 


tcaggacggt 


300 


ttgacctata cctacaaact tcgtaaggat 


gccaagtggt 


ttacttctga 


gggagaagaa 


360 


tatgcgcctg taactgccca ggattttgtg 


acaggtttgc 


aatatgcagc 


tgataaaaaa 


420 


tcagaagcct tgtatctagt gcaggactct 


gttgctggtt 


tggatgacta 


tatcactggt 


480 
aaaacaagcg


acttttcaac 


tgtcggtgtc 


aaggcacttg atgaccaaac ggttcaatat 


540 


actttggtta 


aaccagaact 


ttactggaat 


tcaaaaacac 


ttgcaacgat actttttcct 


600 


gttaatgcag 


atttcctgaa 


atcaaaaggg 


gatgattttg ggaaggcgga tccatctagt 


660 


attttgtaca 


atggaccttt 


cttgatgaaa 


gcacttgtct 


caaaatctgc tattgaatat 


720 


aagaaaaacc 


ctaattactg 


ggatgctaag 


aatgtctttg tagacgatgt gaaattgacc 


780 


tactatgatg 


gtagcgacca 


agaatcactg 


gaacgtaatt 


ttacagctgg tgcttatact 


840 


acggctcgtc 


tttttcctaa 


cagctccagc 


tatgaaggga 


ttaaagaaaa atacaaaaac 


900 


aatatcatct 


atagtatgca 


aaattcaact 


tcatatttct 


ttaattttaa cctagatagg 


960 


aagtcttaca 


attatacttc 


taaaacaagt 


gacattgaaa 


agaaatcgac tcaggaagca 


1020 


gttctcaata 


aaaacttccg 


tcaggctatc 


aattttgctt 


ttgacagaac atcttatggg 


1080 


gctcagtctg 


aagggaaaga 


aggtgcaaca 


aagattttgc 


gtaacctagt ggttcctcca 


1140 


aactttgtca 


gtatcaaggg 


aaaagacttt 


ggtgaagttg 


tagcctctaa gatggtcaac 


1200 


tatggtaagg 


aatggcaagg 


tatcaacttt 


gcggatggtc 


aagaccctta ctacaatcct 


1260 


gagaaagcca 


aggctaagtt 


tgcggaagct 


aagaaagaac 


tcgaagcaaa gggtgttcaa 


1320 


ttcccaatcc 


acttggataa 


gactgtggaa 


gtaacagata 


aagtaggcat acaaggagtt 


1380 


agttctatca 


aacaatcaat 


tgaatctgtt 


ttaggttctg ataatgtagt gattgacatt 


1440 


cagcaattaa 


catcagatga 


gtttgacagt 


tcaggctact 


ttgctcaaac agctgctcag 


1500 


aaagattatg 


atttatatca 


tggcggttgg 


ggacctgatt 


atcaagaccc gtcaacctat 


1560 


ctcgatattt 


ttaatactaa 


tagtggagga 


tttctgcaaa atcttggact agagcctggt 


1620 


gaggccaatg 


acaaggctaa 


ggcagttgga 


ctggatgtct 


atactcaaat gttggaagaa 


1680 


gctaataaag 


agcaagatcc 


ggccaaacgt 


tatgagaaat 


atgctgatat tcaagcttgg 


1740 


ttgattgata 


gttctttagt 


tcttccaagt 


gtttcgcgtg ggggaacacc atcattgaga 


1800 


agaaccgtac 


catttgctgc 


tgcctatggt 


ttaaccggta 


caaaaggggt tgaatcatat 


1860 


aaatacctca 


aagtacaaga 


taagattgtc 


acaacagacg aatatgcaaa agccagagaa 


1920 


aaatggttga 


aagaaaaaga 


agaatccaat 


aaaaaagccc 


aagaagaatt ggcaaaacat 


1980 


gtcaaataa 










1989 
<210> 135

<211> 1647

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 135 



ttatcaaaat 


tgaatgagga 


atctatgtcg 


cacgaaaaca atcaccagca ggcccagatg 


60 


ttacggggga 


ctgcttggct 


aacggctagt 


aactttatca gtcgcctact cggggctgtt 


120 


tacattatcc 


cttggtacat 


ctggatgggg 


gcttatgcag ctaaggcaaa tggtctcttt 


180 


accatgggtt 


acaatatcta 


tgcttggttc 


ttgttggttt caacagcggg gattccagtt 


240 


gcggtggcca 


agcaagttgc 


caagtataat 


accatgcgag aagaagagca tagctttgcc 


300 


ctgattcgga 


gcttcttagg 


ctttatgaca 


ggactaggcc tggtttttgc tttagtcttg 


360 


tatgtctttg 


ctccttggct 


agcagacttg 


tctggcgtgg gcaaagactt gatcccaatc 


420


atgcaaagct 


tggcttgggg 


agtcttgatt 


ttcccgtcta tgagtgttat ccgaggattt 


480 


ttccaaggga 


tgaataacct 


caaaccctat 


gccatgagcc aaattgctga gcaggtcatt 


540 


cgtgttatct 


ggatgctcct 


agcaaccttt 


atcattatga agctcggttc aggagattat 


600 


ctagcagccg 


ttacccaatc 


aacctttgct 


gcctttgtcg gtatggtagc cagttttgca 


660 


gtcttgattt 


atttccttgc 


ccaagaaggt 


tcactcaaaa gaatctttga aacaggagat 


720 


aagattaaca 


gtaagcgtct 


cttggttgat 


accattaagg aagccattcc ttttatcctg 


780 


acagggtctg 


ccatccagct 


cttccagatt 


ttggatcagc tgacctttat caatagtatg 


840 


agctggttta 


ccaactacag 


caatgaggac 


ttggttgtca tgttttctta tttctcagcc 


900 


aatcctaata 


aaatcacgat 


gattttgatt 


tctgtagggg tttcgattgg gagtgttggt 


960 


ttgccacttt 


tgacggaaaa 


ctatgtcaag 


ggggacttga aagcagcttc tcgtctcgtt 


1020 


caggacagtc 


tcaccctact 


ctttatgttc 


ttgctaccag caacggttgg agtggttatg 


1080 


gtaggagaac 


ctctttatac 


ggtcttctat 


ggtaagccag atagtttggc tctgggctta 


1140 


tttgtctttg 


cagttttgca 


gtctattatt 


ttaggcttgt acatggtctt gtctccaatg 


1200 


cttcaggcca 


tgttccgcaa 


ccgcaaggcc 


gttctctatt ttatctatgg ttctattgcc 


1260 


aagctagtct 


tgcaactacc 


taccatcgcc 


ctcttccaca gttatggtcc tttgatttca 


1320 


acaaccattg 


ctctcatcat 


tcctaacgtc 


ttgatgtatc gggatatttg taaagtaact 


1380 


ggtgtcaagc 


gcaaggtgat 


tttgaagcga 


accattttaa tcagtttgct gaccctagtc 


1440 


atgtttctgt 


taataggaac 


catccagtgg 


ctgttaggat ttttcttcca accaagtgga 


1500 
cgtttgtgga gcttctttta tgtagctctt gtcggtgcca tggggggtgg actttatatg 1560
gttatgagtc tgcgtaccta tttattagat aaggtaatag gaaaagccca agcagatcgc 1620
ctgcgagcaa aatttaagct ttcgtaa 1647 

<210> 136 

<211> 639 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 136 



gcaaatttct tgcaagttct tttgttttgt tgtaatatat tttataacaa cgagagagtt 60
ctcgaaattt taagaaaaag gagacacatc atgtctaaaa aagtattatt tatcgtcgga 120
tcactacgtc aaggttcttt caaccaccaa atggcgctcg aagctgagaa agcacttgct 180
ggtaaagcgg aagttagcta ccttgattat tcagcccttc ctctcttcag ccaagatttg 240
gaagttccaa cacatccagc tgtagctgct gctcgtgaag cagttctcgt tgcggatgct 300
atctggattt tctctccagt ctacaacttc tctatccctg gtacagtgaa aaacttgctt 360
gactggctat ctcgtgccct tgacttgtct gatacacgtg gcgtttctgc ccttcaagac 420
aagtttgtca cagtatcatc tgtagccaat gcagggcacg atcaactttt cgctatctac 480
aaagacctct tgccatttat ccgtacacaa ggcgttggtg atttcactgc tgcacgtgtt 540
aatgactctg cctgggcaga cggaaaattg gttcttgaag aaacagtcct aaactcactt 600
gaaaaacaag ctcaagactt ggtcgaagct atcaagtaa 639



<210> 137 
<211> 1902 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 137 

agccatccat gcttacctga gggagaaaaa atgagtgatt ttatcgttga aaaactaagt 60 
aaatccgttg gtgacaagac cgtttttagg gatatttcct ttattatcca tgacttagac 120 
agaattggtt taatcggtgt caatgggact ggcaagacca cccttttgga cgtcctttct 180 
ggtgtttctg gatttgatgg ggatgtcagt cctttttcag ctaaaaatga ttaccagatt 240 
ggttacttga ctcaggatcc tgattttgat gatagaaaga cagttttgga tacggttcta 300
tctagtgaac tcaaggaaat ccagctcatt cgtgagtatg aattgattat gctcgactat 360 
agtgaggaca agcaggcgcg tttggaacgt gtcatggcag agatggactc tctccaagct 420 
tgggaaatcg aaagtcaggt caagaccgtt cttagcaaat tgggcattca agacttatct 480
actcctgttg gggaattgtc aggtggtctg agaagacggg tacagttggc acaagtctta 540
cttggcaacc acgacctctt gcttttggat gagccgacca accatctgga tattgcgatt 600
attgagtggc tgaccctctt tttgaaaaat tctaagaaga ccgtcctttt tatcactcac 660
gatcgttatt tcttagacgc tttgtcaaca cggattttcg agttggatcg tgcaggcttg 720
accgagtacc agggaaatta ccaggactat gttcgcctaa aggcggaaca ggatgagcgc 780
gacgcggctc ttcttcacaa aaaagaacaa ctctacaaac aagaattggc ctggatgcgc 840
agacaaccgc aggcgcgtgc gaccaagcaa caagctcgta tcaatcgttt ccatgatctg 900
aaaaaggaag tttcaggcag tagtgctgag acagacttga ctatgaactt tgaaaccagt 960
cggattggga agaaagtcat cgagtttcag gatgtttcct ttgcctatga aaataagccc 1020
attttgcaaa attttaatct cttagttcag gctaaagacc gtattggaat tgttggggac 1080
aatggtgttg gaaaatcaac cctacttaac ctgattgcag gaagtcttga gccgacagca 1140
ggacaagttg tgattgggga aactgttcgc atcgcctatt tctctcaaca aattgagggt 1200
ttggatgaaa gcaagcgtgt gatcaattac ctgcaggaag tggcagagga ggtcaagacc 1260
agtggtggtt ctacgacttc catcgctgag ttgctggagc aattcctctt cccacgttcg 1320
acgcatggga ctttgattga gaaattgtca gggggtgaga aaaaacgtct ttatctcctc 1380
aaactgcttt tggaaaaacc aaatgttctt cttttagacg agccaaccaa tgacctagat 1440
attgcaactt tgacagtctt agagaatttc ttgcaaggtt ttgcaggtcc cgttttaaca 1500
gtcagtcacg accgctattt cttggataag gtagcgacca agattctcgc ttttgaggat 1560
ggcaagattc gtcctttctt tggtcattac accgactatc ttgatgaaaa agcttttgaa 1620
acagatatgg ccaatcaagt gcaaaaggcc gaaaaggaaa aagtggtcaa ggttcgagaa 1680
gacaagaaac gcatgaccta ccaagaaaag caggagtggg caagtattga aggtgatatt 1740
gaaaccttgg aaaaacgtat cgctgctatt gaagaggaaa tgcaggctaa cggctctgac 1800
tttggtaagc tggctactct ccaaaaagaa ttggatgaga aaaatgaagc actccttgaa 1860
aaatacgaac gctatgagta tctcagtgaa tttgatagtt aa 1902



<210> 138 
<211> 579 
<212> DNA 
<213> Streptococcus pneumoniae
<400> 138 

tatactaagg tagtaatcat taagaagtgg ttacaaaaaa taatgaatga ggtaaagaaa 60 

atggtagaat tgaaaaaaga agcagtaaaa gacgtaacat cattgacaaa agcagcgcca 120

gtagcattgg caaaaacaaa ggaagtcttg aaccaagctg ttgctgattt gtatgtagct 180 

cacgttgctt tgcaccaagt gcactggtat atgcatggtc gtggtttcct tgtatggcat 240 

ccaaaaatgg atgagtacat ggaagctctt gacggtcaat tggatgaaat cagtgaacgc 300 

ttgattacac tcggtggaag cccattctct acattgacag agttccttca aaatagtgaa 360 

atcgaagaag aagctggtga ataccgtaat gttgaagaaa gcttggaacg tgttcttgtt 420

atctaccgtt acttgtcaga acttttccaa aaaggtttgg atgtcactga tgaagaaggt 480 

gacgatgtga caaacggtat ctttgcaggc gctaaaactg aaacagataa aacaatttgg 540 

atgcttgcag ccgaacttgg acaagcacct ggtttgtaa 579 

<210> 139 

<211> 1083 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 139 



tctagcaatc ttttgtttgg gcttatcggc tgcatttatg gggcgtttgg tagaaaaatt 60
tggtccgaaa gtcatgggaa gtctatctgc ttttctatac gcaggtggaa atatcttaac 120
aggatttgca atagaccgtc agagctgtgg ttgttgtatc tcgcttatgg cattttaggt 180
gggcttggtt tgggagcagg ctatattacc cctgtgtcga cgattataaa atggtttcct 240
gataaacgtg gtctcgcaac aggtttagcg attatggggt ttggttttgc ttctttattg 300
actagtccca tagcgcaaca cctcatcgca ggggtagggc ttgtagaaac tttttatatt 360
ttaggagcaa gttactttat tatcatgctc ctagcttcac aattcattaa gcgtccaaat 420
gagcaagagc ttgcaatttt atcttcttca gggaaagaaa aaacagcctc tttgacgcaa 480
ggaatggctg caaatcaggc tctaaaaagc aatcggtttt atatgctttg gattattttc 540
tttatcaaca tagcttgtgg tttaggctta atttcagcgg catcgccaat ggcacaggag 600
atggctggct tgtctacaag tcatgcagca gtaatggtgg gtgttttggg gattttcaat 660
ggatttggtc gcttgctctg ggcgagtttg tctgactata tcggtcgccc tctaaccttt 720
agtatattac tgcttgtcaa tcttttcttt tctctctcac tttggctctt tacagattcc 780
gttttatttg tagttgctat gtctattttg atgacttgct atggagctgg tttttctttg 840

attccagctt atctcagtga tatttttgga accaaggaat tggccgctct gcatggttat 900 

attttaacag cttgggcaat ggctggttta gcgggaccta ttttattagc agagacttat 960 

aaaatggctc attcgtacac acaaaccttg ttcgtttttc tcattttata cagtatcgcc 1020

ttggctttgt cttattatct aggtcgttca atcaaaaaag aaagtcaaaa agcgcttaca 1080 

tga 1083 

<210> 140 
<211> 468 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 140 

gacaatatga agcaaacaaa aacaactaaa atcgcccttg tatccctatt aaccgccctt 60 

tctgtggttc taggttattt cttaaaaatc ccaacaccta caggaattct aactctttta 120

gatgctggtg tcttctttgc ggccttttac tttggtagtc gtgaaggagc ggtagtcgga 180 

ggactagcaa gtttcttgat tgacctctta tcaggctacc ctcagtggat gttctttagc 240 

ttggtcaacc atggcttgca gggatttttc gcaggattta aaggaaaaag tcagtggtta 300 

ggccttattt tagcaactat tgccatggta ggaggctacg ccttgggttc tgctttgatg 360 

aatggctggg cagcagccct cccagaaatt ctaccgaatt ttatgcaaaa tatggtaggg 420

atgattgtag gatttattct tagtcaaagt atcaagaaga ttaagtaa 468 

<210> 141 
<211> 684 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 141 

gagaagatga tttcaaagag attagaattg gtagcttcct ttgtgtcaca gggggctatt 60 

ttactagatg tgggaagtga ccatgcttat ctgcctatcg agttggttga gagaggccaa 120

atcaaaagcg ctattgcagg tgaggtggtg gaaggtccct atcagtctgc ggttaaaaat 180 

gttgaggctc acggcctaaa ggagaaaatc caagtccgtt tagccaatgg cttggcagct 240 

tttgaagaga ctgaccaagt gtctgtcatt accattgctg gcatgggtgg tcgtttgatt 300 

gctaggattt tagaagaagg tttggggaag ttagctaatg tagagcgttt gatcctccag 360 
cccaataatc gtgaagacga cttgcgtatc tggctacagg atcatggatt ccagattgta 420
gcagaaagca tcttagaaga agctggaaag ttttatgaga ttttggtggt ggaagcagga 480
caaatgaagc tatcagccag tgatgttcgc tttggtccct tcttgtccaa agaagtcagt 540
ccagtatttg tccaaaaatg gcaaaaagaa gctgagaagc tagagttcgc cctcggacaa 600
atcccagaaa aaaatctgga agaacgtcaa gttctagtag ataagattca agctatcaag 660
gaggtgctcc atgttagcaa gtga 684



<210> 142 
<211> 336 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 142 

gaaaaaattt tggagggtat ccgtatgaaa attgttggtg ttgcagcttg tactgtggga 60 
attgcccaca cttatattgc acaggaaaaa ttagagaatg ccgcaaaggt agctggacat 120 
gtgattcatg ttgagactca ggggacaata ggggtagaaa atgaattgag tcaagagcag 180 
attgatgcag cggatgtagt tattttagca gttgatgtta agatttctgg tatggaacgc 240 
tttgagggta aaaagattat caaggttcca acagaagtgg cagtcaaatc tcccaataaa 300 
ctgattgcta aagctgttga gattgttacg aaataa 336 

<210> 143 

<211> 777 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 143 



cttgacttaa tttttttttt aatgtatatt aagagacagg aggaatacaa gtttatgata 60
cgtatcgaaa acctcagtgt ctcctacaaa gaaacgttgg cacttaagga tatttcacta 120
gtgctccatg gaccaacaat taccggcatc attggtccaa acggcgctgg gaaatcaaca 180
ctattaaaag gtatgttggg aattatccca catcaaggtc aggcatttct cgatgacaag 240
gaagttaaaa aatccttaca ccgaattgcc tatgtcgaac aaaaaatcaa tatcgactac 300
aactttccca tcaaggtcaa ggaatgcgtc tcgttaggac tatttccctc tattcctctc 360
tttcgaagtt taaaggctaa acattggaag aaagtgcaag aggcccttga aatcgtcggc 420
ctagctgact acgctgaacg tcaaattagt caactgtctg gaggtcaatt ccagcgggtc 480
ttgattgcca gatgtttggt gcaggaagcc gactatatcc tcttggatga accctttgct 540
gggattgact ctgtcagtga ggaaatcatc atgaatacgc tgagagattt gaaaaaagct 600

gggaagacgg ttctcatcgt tcaccacgac ctcagcaaga ttccccacta cttcgatcaa 660 

gtcttacttg tcaatcgaga agtgattgcc tttggtccaa caaaagaaac ttttaccgaa 720 

accaatctaa aagaagctta cggtaatcaa ctctttttca atggaggtga cctatga 777 

<210> 144 

<211> 897 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 144 



aaggaggtat ttatgacata ttacgttgca afcfcgatatcg gtggaaccaa catcaagtat 60
ggtttggttg atcaagaggg gcaacttctt gaatcgcatg aaatgccaac tgaggcgcat 120
aagggtggac ctcatatctt acaaaagacc aaagatatcg tagctagtta tttagaaaaa 180
ggcccagtag caggtgttgc catatcttct gctgggatgg tggatccgga taagggtgag 240
attttctatg ctgggccgca aatccctaac tacgcaggca cccagttcaa aaaggaaatc 300
gaagaaagct ttactattcc ttgtgagatt gaaaatgatg tcaactgtgc aggtcttgct 360
gaggcagtat ctggttcagg caagggagca agtgtgacac tttgcttgac cattggaacc 420
ggtatcggtg gttgcttgat tatggatagg aaagtcttcc atggttttag caattcagcc 480
tgtgaagtcg ggtatatgca tatgcaggat ggagcttttc aagacttggc ttctacaaca 540
gctttagtga aatatgtagc tgaagcccat ggagaagatg ttgatcagtg gaatggccgt 600
agaattttca aagaagccac tgaaggaaac aaaatctgca tggaaggtat tgaccgtatg 660
gttgactatc taggaaaagg tctggcaaat atttgctacg ttgccaatcc agaagtggtt 720
attcttggtg gtggtatcat ggggcaagag gctatcctca aacctaagat ccgtacagcc 780
ttgaaagagg ctttggtacc aagtttagca gaaaaaacac gattagaatt tgcccatcac 840
caaaatacag cagggatgtt gggtgcatat tatcatttta agacaaaaca atcctag 897



<210> 145 
<211> 690 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 145 

caaaaaagaa aacagtttac aaagaaaaat gatggaggag caaacatggc acaaaaagga 60 
gtaagcctta tcaaggcagc atttgataca gataactttc tcatgcgttt tagtgagaag 120
gtcttggaca tcgtgacagc caatcttctt tttgtcgtct cttgtttacc catcgtgacg 180


atfcggagfcgg 


ctaaaatcag 


cctctacgag 


accatgttcg 


aagttaagaa 






gtgcctgttt 


ttaaaafccta 


tctaagafcct 










ctgggtfctaa 




aattgtgttt 






tcttttctgg 














tctgattttt 




cttactatcg tgatgctggc tagttaccct atcgcggcac gttatgacct atcttggaaa 480
gaaattcttc aaaaaggatt gatgttggct agttttaact ttccttggtt cttcctcatg 540
ttagccattc ttgtcctcat tgtgatggtt ctttatctgt ccgccttcag tctactctta 600
ggtggctcag tcttcctact ttttgggttt ggactattgg tctttatcca gactggattg 660
atggagaaaa ttttcgcaaa ataccaatag 690



<210> 146 

<211> 915 

<212> DNA 

<213> Streptococcus pneumoniae 



<400> 146 



cccaatcttg taaaagaagg gagaaggaga atggttaaag aacgtaattt aactcgctgg 60
atatttgttt tgccagctat gattatcgta ggattactct ttgtttatcc gtttttctcg 120
agtatttttt atagctttac caataagcat ttgattatgc ctaattataa atttgttggt 180
ttggctaact ataaagctgt gctatcagat cccaacttct ttaatgcgtt ctttaattca 240
attaagtgga ccgttttctc attagttggt caagttttag tagggtttgt attggcttta 300
gctcttcaca gagtacgcca cttcaagaaa ttatatagga cattattgat tgttccttgg 360
gcatttccta ccatcgttat tgccttctct tggcagtgga ttctaaacgg ggtttatggc 420
tacttaccta atctaatcgt aaaattaggt ttaatggaac atacacctgc atttttgaca 480
gatagtacat gggcattcct atgtttggtg tttatcaaca tttggtttgg agcaccaatg 540
attatggtta atgtgctttc agctttgcaa acagtaccag aagaacaatt tgaggctgct 600
aagatagatg gtgcttcaag ttggcaggtg ttcaagttta tcgtctttcc acatattaaa 660
gtggttgtag gacttctagt tgttttgaga actgtatgga tctttaataa ctttgacatt 720
atctacctca ttactggtgg tggaccagcc aatgctacaa cgacgcttcc aatttttgct 780
tacaacctgg gctggggaac taaattgttg ggtcgtgctt cagcagttac agtactgctc 840
tttatcttct tggtggcgat ttgctttatc tactttgcta tcatcagtaa gtgggaaaag 900 
gagggtagaa aataa 915 

<210> 147 

<211> 1356 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 147 



tgtagaaaga gaagaacgat gaaaaaaatg agaaagtttt tatgtctagc tggaattgcg 60
ctagcggctg ttgccttggt agcttgttca ggaaaaaaag aagctacaac tagtactgaa 120
ccaccaacag aattatctgg tgagattaca atgtggcact cctttactca aggaccccgt 180
ttagaaagta ttcaaaaatc agcagatgct ttcatgcaaa agcatccaaa aacgaaaatc 240
aagattgaaa cattttcttg gaatgacttc tatactaaat ggactacagg tttagcaaat 300
ggaaatgtgc cagatatcag tacagctctt cctaaccaag taatggaaat ggtcaactca 360
gatgctttgg ttccgctaaa tgattctatc aagcgtattg gacaagataa atttaacgaa 420
actgccttaa atgaagcaaa aatcggagat gattactact ctgttcctct ttattcacat 480
gcacaagtca tgtgggttag aacagatttg ttaaaagaac ataatattga ggttcctaaa 540
acttgggatc aactctatga agcttctaaa aaattgaaag aagctggagt ttatggcttg 600
tctgttccgt ttggaacaaa tgacttaatg gcaacacgtt tcttgaactt ctacgtacgt 660
agtggtggag gaagcctctt aacaaaagat cttaaagcag acttgacaag ccaacttgct 720
caagatggta ttaaatactg ggttaaattg tataaagaaa tctcacctca agattctttg 780
aactttaatg tccttcaaca agctaccttg ttctatcaag gaaaaacagc atttgacttt 840
aactctggct tccatatcgg aggaattaat gccaacagtc ctcaattgat tgattcgatt 900
gatgcttatc ctattccaaa aatcaaagag tctgataaag accaaggaat tgaaacctca 960
aacattccaa tggttgtttg gaaaaattca aaacatccag aagttgctaa agcattctta 1020
gaagcacttt ataatgaaga agactacgtt aaattccttg attcaactcc agtaggtatg 1080
ttgccaacta ttaaggggat tagcgattct gcagcctata aagaaaatga aactcgtaag 1140
aaatttaaac atgctgaaga agtaattact gaagctgtta aaaaaggtac tgctattggt 1200
tatgaaaatg ggccaagtgt acaagctggt atgttgacta accaacacat tattgaacaa 1260
atgttccaag atatcattac aaatggaaca gatcctahga aagcagcaaa agaagcagaa 1320
aaacaattaa atgatttatt tgaggctgtt cagtag 1356

<210> 148 

<211> 2403 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 148 



atgtcttatt tcagaaatcg ggatatagat atagagagga tcagtatgaa tcggagtgtt 60
caagaacgta agtgtcgtta tagcattagg aaactatcgg taggagcggt ttctatgatt 120
gtaggagcag tggtatttgg aacgtctcct gttttagctc aagaaggggc aagtgagcaa 180
cctctggcaa atgaaactca actttcgggg gagagctcaa ccctaactga tacagaaaag 240
agccagcctt cttcagagac tgaactttct ggcaataagc aagaacaaga aaggaaagat 300
aagcaagaag aaaaaattcc aagagattac tatgcacgag atttggaaaa tgtcgaaaca 360
gtgatagaaa aagaagatgt tgaaaccaat gcttcaaatg gtcagagagt tgatttatca 420
agtgaactag ataaactaaa gaaacttgaa aacgcaacag ttcacatgga gtttaagcca 480
gatgccaagg ccccagcatt ctataatctc ttttctgtgt caagtgctac taaaaaagat 540
gagtacttca ctatggcagt ttacaataat actgctactc tagaggggcg tggttcggat 600
gggaaacagt tttacaataa ttacaacgat gcacccttaa aagttaaacc aggtcagtgg 660
aattctgtga ctttcacagt tgaaaaaccg acagcagaac tacctaaagg ccgagtgcgc 720
ctctacgtaa acggggtatt atctcgaaca agtctgagat ctggcaattt cattaaagat 780
atgccagatg taacgcatgt gcaaatcgga gcaaccaagc gtgccaacaa tacggtttgg 840
gggtcaaatc tacagattcg gaatctcact gtgtataatc gtgctttaac accagaagag 900
gtacaaaaac gtagtcaact ttttaaacgc tcagatttag aaaaaaaact acctgaagga 960
gcggctttaa cagagaaaac ggacatattc gaaagcgggc gtaacggtaa cccaaataaa 1020
gatggaatca agagttatcg tattccagca cttctcaaga cagataaagg aactttgatc 1080
gcaggtgcag atgaacgccg tctccattcg agtgactggg gtgatatcgg tatggtcatc 1140
agacgtagtg aagataatgg taaaacttgg ggtgaccgag taaccattac caacttacgt 1200
gacaatccaa aagcttctga cccatcgatc ggttcaccag tgaatatcga tatggtgttg 1260
gttcaagatc ctgaaaccaa acgaatcttt tctatctatg acatgttccc agaagggaag 1320
ggaatctttg gaatgtcttc acaaaaagaa gaagcctaca aaaaaatcga tggaaaaacc 1380
tatcaaatcc tctaccgtga aggagaaaag ggagcttata ccattcgaga aaatggtact 1440
gtctatacac cagatggtaa ggcgacagac tatcgcgttg ttgtagatcc tgttaaacca 1500
gcctatagcg acaagggtga tctatacaag ggtgaccaat tactaggaaa tatctacttc 1560
acaacaaaca aaacttctcc atttagaatt gccaaggata gctatctatg gatgtcctac 1620
agtgatgacg acgggaagac atggtcagct cctcaagata ttactccgat ggtcaaagcc 1680
gattggatga aattcttggg tgtaggtcct ggaacaggaa ttgtacttcg gaatgggcct 1740






accggtttat acgactaata atgtatctca 


cttagatggc 




tcgcaatctt ctcgtgtcat ctattcagat gatcatggaa aaacttggca tgctggagaa 1860
gcggtcaacg ataaccgtca ggtagacggt caaaagatcc actcttctac gatgaacaat 1920
agacgtgcgc aaaatacaga atcaacggtg gtacaactaa acaatggaga tgttaaactc 1980
tttatgcgtg gtttgactgg agatcttcag gttgctacaa gtaaagacgg aggagtgact 2040
tgggagaagg atatcaaacg ttatccacag gttaaagatg tctatgttca aatgtctgct 2100
atccatacga tgcacgaagg aaaagaatac atcatcctca gtaatgcagg tggaccgaaa 2160
cgtgaaaatg ggatggtcca cttggcacgt gtcgaagaaa atggtgagtt gacttggctc 2220
aaacacaatc caattcaaaa aggagagttt gcctataatt cgctccaaga attaggaaat 2280
ggggagtatg gcatcttgta tgaacatact gaaaaaggac aaaatgccta taccctatca 2340
tttagaaaat ttaattggga atttttgagc aaaaatctga tttctcctac cgaagcgaac 2400
tag 2403



<210> 149 
<211> 636 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 149 

acgatgagac ttgaaattat aaatggacag aaaatttatg ggaaaagacc tattttaaat 60 
cagttgaatt tggtgtttca atcaggaaaa atttatggac ttaaaggtga taatggatct 120
ggcaagacgg ttcttttaaa gatacttgct ggttatatta agcttgacaa aggaaaagtt 180 
cttcaagatg gtaaagttta cggggtaaaa aatcattata ttcaggatgc aggaatttta 240 
attgaaaaag tcgagttttt atctcattta tccctgagag aaaatttgga actgttaagg 300 
tatttttcat ctaaagttac ggaaaaaaga attgcctatt ggattcaata ctatgattta 360
caggaatttg aagacattga ataccgtcat ttatccttag gaacaaagca aaaaatggcc 420
thgattcaag cchttatttc ctctccttct atactctttc tcgatgaacc tatgaatgct 480
ttggatgaga agagtgtgag gttaaccaaa caggtcattt tatcttacct gaaaaaagaa 540
aatggtctgg ttatcctgac gtcgcacata tcggaagata tttcagacct ttgtacagat 600
gtattagttg tcgaaaatgg acatatacaa atgtaa 636

<210> 150
<211> 297
<212> DNA

<213> Streptococcus pneumoniae
<400> 150

cggatgcgtt ccatgacccg tctggcttcc caggtttcgt catttccatg tttcactttc 60
gcaaaatgct tctccaaatc ttcaaagttg aagttggatg tgaaaaaggt cggtaaattt 120
tcctgcatcc gatattggag aatgacctgc aggatttcgt cacgcaccca aacggttgat 180
tgctcggcgc caatatcatc taaaatcagg acctcagaca gcttaatctc atccaccaag 240
gtcttaacat tgccatcact gatagcattt ttgacatcaa tgacaaagct aggatag 297

<210> 151
<211> 1509
<212> DNA

<213> Streptococcus pneumoniae
<400> 151

tcagtgatta tattaaagga gtttaagcct atgtcattac tagtatttga aaatgtatcc 60
aaatcatatg gagcaacacc agcccttgaa aatgtttctc ttgacattcc agctggaaaa 120
attgtcggcc ttcttgggcc aaacggctca ggaaaaacaa ccctgattaa actaattaat 180
ggcctcttac aaccagatca aggacgtgtc ctcatcaacg acatggaccc aagcccagca 240
accaaggccg ttgtagctta tttgcctgat acgacctatc tcaatgagca aatgaaggtc 300
aaagaagccc taacctactt caagaccttc tataaagatt tcaatcttga acgcgcccat 360
catctacttg cagacctggg cattgatgaa aatagtcgtc tcaagaaact atcaaaagaa 420
aacaaagaaa aggttcaact gattttggtt atgagccgtg atgctcgtct ctatgttttg 480
gacgaaccca ttggtggggt ggatccagca gcccgtgctt atatcctcaa taccattatc 540
aacaactact caccaacttc taccgttttg atttctaccc acttgatttc tgatatcgag 600






ccaatcttgg atgaaattgt cttcctaaaa gacggaaaag tcgtccgtca aggaaatgta 660
gatgatattc gctacgagtc aggtgaatcc attgaccaac tcttccgtca gaatttaagg 720
cctaagcaaa ggagattatt tatgttttgg aatttagttc gctacgaatt taaaaatgtt 780
aacaagtggt atttagccct ctacgcagcc gtgctagtcc tttctgccct catcggaata 840
cagacacaag gctttaaaaa tctaccttac caagaaagtc aggctactat gctacttttt 900
ctagctacag tctttggtgg cttgatgctt acacttggga tttcaaccat tttcttgatt 960
attaaacgct tcaaaggtag tgtctacgac cgacaaggct atctgacttt gaccttgcca 1020
gtttctgaac accatatcat cacagccaaa ctaatcggtg cctttatctg gtcattgatt 1080
agcaccgctg tattggctct aagtgctgtt attattctgg ctttaacagc tccagaatgg 1140
attcctcttt cttatgtgat tacatttgta gaaacacatc tccctcagat ctttcttaca 1200
ggtatatcct tcctactaaa tactatttca ggaatcctct gcatctacct ggctatttcc 1260
attggacagc ttttcaatga ataccgtaca gcactcgctg ttgcagtcta cattggtatc 1320
caaatcgtca ttggatttat tgaacttttc ttcaatctta gttctaattt ctatgtcaat 1380
tcactggtag gactcaatga ccatttctat atgggagcag gtatagccat tgttgaagaa 1440
ctcatattca tagctatctt ttatctcgga acctactaca tcttgagaaa taaggttaat 1500
ttgctttaa 1509



<210> 152 
<211> 1185 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 152 

aaaagctgtc cgcaagttgt tccagatgtc attgacctct tggtaacacc attcgtgaca 60 
cttttggtca tgtctatcct tggactcttt gtcattggac cagttttcca cgttgttgaa 120 
aactacatcc ttattgctac aaaagcgatt cttagcatgc catttggtct tggtggtttc 180 
ttgattggtg gggttcacca attgatcgtc gtgtcaggtg tgcaccacat cttcaacttg 240 
cttgaagtgc aattacttgc tgctgaccat gctaacccat tcaacgctat catcacagct 300
gctatgacag ctcaaggtgc tgctactgtt gcggttggtg ttaaaacaaa aaatccaaaa 360
ctgaaaacac ttgctttccc ggctgctctt tctgccttcc taggtattac agagcctgct 420 
atcttcgggg tgaacttgcg cttccgtaaa ccattcttcc tttcattgat tgctggtgca 480 
atcggtggtg gattggcttc tatccttgga cttgctggta ctggtaatgg tatcaccatc 540
atccctggta caatgcttta tgttggtaac ggacaacttc cacaatacct tcttatggta 600
gctgtatcat ttgcccttgg ttttgctctt acttacatgt ttggttacga agatgaagta 660
gacgcaactg cagctgcaaa acgagctgaa gtggctgaag aaaaagaaga agttgcgcca 720
gcagctcttc aaaatgaaac acttgtaact cctatcgtcg gtgatgttgt cgctcttgct 780
gatgtcaatg acccagtctt ctcaagtgga gctatgggac aaggtatcgt tgtgaaacca 840
agccaaggcg tggtctatgc accagctgat gctgaagttt caattgcctt tccaacaggg 900
cacgcttttg gtttgaaaac aagaaatggt gctgaagttt tgattcatgt tggtattgat 960
actgtatcta tgaacggtga cggttttgaa acaaaagttg ctcaaggtaa taaggtgaaa 1020
gctggcgatg ttcttggaac atttgactca aacaaaatcg ctgcagctgg acttgatgat 1080
acaacaatgg ttatcgttac aaatacaggt gactacgctt cagtagctcc agtcgcaaca 1140
ggttcagttg ctaaggggga tgctgtgatc gaagtgaaaa tctaa 1185



<210> 153 

<211> 792 

<212> DNA 

<213> Streptococcus pneumoniae 

<400> 153 



aatcgctttc aaacaagaac aaaatgttat ataaggagat ttttgcaaat gaacaatcag 60
gaaattgcaa aaaaagtcat cgatgccttg ggcggacgtg aaaatgtcaa tagtgttgcc 120
cactgtgcga ctcgtctacg tgtcatggtc aaagatgaag agaaaatcaa taaagaagtg 180
attgagaact tggaaaaagt tcaaggtgct ttctttaact cagggcaata ccaaattatc 240
tttggtacag gtacagttaa caaaatgtac gatgaagttg ttgtacttgg attaccaaca 300
tcatctaagg atgacatgaa agcagaagtt gctaaacaag ggaactggtt ccaacgtgct 360
atccgtactt ttggtgatgt tttcgttcca atcatcccag ttatcgtagc gacaggtctc 420
ttcatgggtg tgcgtggtct tttcaacgct cttgaaatgc cacttccagg tgactttgca 480
acttacacac aaatcttgac agatacagcc ttcatcatct tgccaggttt ggttgtgtgg 540
tcaaccttcc gtgtatttgg tggaaatcct gccgttggta tcgttcttgg tatgatgctt 600
gtctctggct cacttccaaa cgcttgggca gttgctcaag gtggtgaagt aacagcgatg 660
aacttctttg gtttcatccc tgttgttggt ttgcaaggtt ccgttcttcc agccttcatc 720
atcggggttg tcggagctaa atttgaaaaa gctgtccgca agttgttcca gatgtcattg 780
acctcttggt aa 792



<210> 154 
<211> 651 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 154 
acaaaatcaa gaattttctg tctatttttt gaatatttat ggagaatgag actgatgaaa 60
atatggtata atgaaataaa ggagttttat atgcaaaaat ttattcaggc ttatattgaa 120
aagctagatg tgacaaccat tatcgagaat attctaacca aggtcatttc tcttttactg 180
cttttaattg tattttatat tgctaaaaaa atgcttcata ccatggtgca gagaattgtc 240
aaaccttctc taaaaatgtc tcgtcatgat gttggacgcc aaaaaaccat ctcacgttta 300
ctagaaaatg tgtttaatta tacgctatat ttctttttac tctactgcat tttgtcgatt 360
ttaggtttgc cagtttctag tttgctggct ggagctggta ttgctggggt agcgattggt 420
atgggagccc aaggctttct gtctgatgtc atcaatggct ttttcatcct ctttgaacgt 480
caactggatg tgggagatga ggtcgttctg acaaatggac cgattactgt atcgggtaag 540
gttgtcagtg tgggaattcg tacgacacag cttcgtagcg aggagcaagc ccttcacttt 600
gtccctaacc gaaatatcac agttgttagc aatttctcac gcacagacta 651


<210> 155 
<211> 1815 
<212> DNA 

<213> Streptococcus pneumoniae 










<400> 155 

agaaataaga ggaagaaaat ggaacaaaaa caccgttcag aatttccaga gaaggaactc 60
tgggacttaa cagccctata ccaagaccgt gaggatttct tgcgtgcaat cgagaaagct 120
cgcgaagaca tcaaccagtt tagccgtgat tacaagggca atcttcacac ttttgaggat 180
ttcgagaagg cctttgcgga attggaacag atctacattc agatgagcca tattggcaac 240
tatggtttta tgcctcagac gacggactat agcaatgacg aatttgccaa tattgcccaa 300
gctgggatgg aatttgaaac agatgccagc gtagccttga ccttctttga cgatgccttg 360
gtggcagcag atgaggaagt cttggaccgt ttgggtaaat tgccacattt aacagctgcc 420
attcgtcagg ctaaaatcaa aaaagcccac tacttagggg cagatgtgga gaaggccttg 480

acaaatctcg gtgaagtttt ctacagtccg caggacattt atactaagat gcgagctggg 540 

gattttgaaa tggctgactt tgaagcccat ggcaagacct acaaaaacag ctttgtgacc 600 

tatgagaatt tctaccaaaa ccatgaggat gctgaggttc gtgagaaatc cttccgttcc 660 

ttctcagagg gacttcgtaa gcaccaaaat acggctgcag cagcchatct ggctcaggtc 720

aagtctgaaa aactcttggc tgatatgaag ggatacgact ctgtctttga ctatcttcta 780

gctgaacaag aagtggaccg tgtcatgttt gaccgccaga ttgacctcat catgaaggac 840 

tttgcaccag tcgctcagag atacctcaag catgttgcca aggtaaatgg tcttgaaaag 900 

atgacctttg cagactggaa attggacttg gacagcgccc tgaatcctga agtgactatt 960 

gacgatgcct atgatttggt catgaagtcg gtagaacctt tggggcaaga atattgtcag 1020 

gaagttgctc gttaccaaga agagcgctgg gtggactttg ctgctaacag tggcaaggat 1080 

tccggtggtt atgcggcgga cccatatcgc gtacaccctt atgtactcat gagctggaca 1140 

ggccgtttga gcgatgtcta taccttgatt catgaaatcg ggcattctgg tcaattcatc 1200 

ttttcagaca atcatcaaag ttacttcaat gcccatatgt cgacctacta tgttgaagca 1260 

ccgtcaacct tcaatgaatt gctactcagt gattacttgg agaaccagtc taatgaccca 1320

cgtcaaaaac gcttcgctct ggctcatcgc ttgacagaca cctacttcca taactttatc 1380 

acccacctct tggaagccgc cttccagcgt aaggtgtata cattgattga agaaggggag 1440 

acctttggag caagcaagct caacagcatt atgaaggaag ttttgacgga tttctgggga 1500 

gatgctattg aaattgacga tgatgcaact ctgacttgga tgcgccaagc tcactactat 1560 

atgggcttgt atagttacac ttactcagca ggactagtta tctcgactgc tggttacctt 1620 

catctaaaac attctgaaac tggagctgaa gactggctca atctcctcaa atcaggtggt 1680 

agcaagacac cacttgagtc agccatgatt atcggtgctg atatttcaac agacaaacca 1740 

ctccgtgata ccatccaatt cttgtctgac acagttgacc agattatctc ctatagtgct 1800 

gagttgggag agtag 1815 

<210> 156 
<211> 615 
<212> DNA 

<213> Streptococcus pneumoniae 
<400> 156 